
Getting Started with Open API (2)

by soychoi 2024. 8. 1.
Understanding Open API (Application Programming Interface) - XML to Excel
data.go.kr

 

 

Understanding the API structure

 

 

 

The dataset I requested Open API access for is "행정안전부_지역별 인구이동 현황" (Ministry of the Interior and Safety: Population Movement Status by Region).

My Page > Data Use > Open API > Application Status > select the item you applied for from the list > Development Account Details

 

 

 

 

 

 

 

In [Development Account Details], check the End Point (the URL used to call the API) and the API parameters.

 

 

1. End Point | Check the API call URL

 

http://apis.data.go.kr/1741000/ppltnDataStus

  • http://apis.data.go.kr/1741000/ppltnDataStus : Base URL
  • http://apis.data.go.kr/1741000/ppltnDataStus : Call address

 

 

2. serviceKey | Check the general authentication key (Decoding)

 

 

 

3. API parameters | Check the request variables (Request Parameter)

 

 

Enter the general authentication key (Decoding) in the serviceKey request variable and click "Preview" (미리보기) to see the result of the API call.
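As a rough sketch of what the preview does, the request URL can also be assembled in Python. The base URL is the one listed above. The operation path `exampleOperation`, the parameter names `pageNo` and `numOfRows`, and the key value are placeholders made up for illustration, so substitute the values listed on the Development Account Details page. The sketch assumes the Decoding key is passed in raw form and percent-encoded once by `urlencode`.

```python
from urllib.parse import urlencode

BASE_URL = "http://apis.data.go.kr/1741000/ppltnDataStus"  # from the portal page


def build_request_url(operation, service_key, **params):
    """Return a GET URL carrying serviceKey and the other request variables.

    `operation` is a hypothetical placeholder for the operation path shown
    in the portal; `service_key` is assumed to be the Decoding key.
    """
    query = {"serviceKey": service_key, **params}
    # urlencode percent-encodes characters such as '+', '/' and '=' once
    return f"{BASE_URL}/{operation}?{urlencode(query)}"


url = build_request_url(
    "exampleOperation",   # hypothetical operation name
    "abc+def/ghi==",      # made-up key, not a real credential
    pageNo=1,             # hypothetical parameter names
    numOfRows=10,
)
print(url)
```

Building the query string with `urlencode` avoids hand-escaping the key, which is an easy place to make a mistake when the key contains `+`, `/` or `=`.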

 

 

 

Downloading the XML before running Python code in Google Colab

 

 

1. Check the variable names with "Data lookup" (데이터 조회하기)

 

 

2. Prepare to run the Open API

 

 

3. Download the XML returned by the Open API call

 

 

 

Running the Python code in Google Colab

 

 

1. Upload the downloaded XML file to Google Colab and copy its path

 

 

2. Paste the path of the uploaded file and the path for the output file, then run the code

import pandas as pd
import xml.etree.ElementTree as ET

# Parse the XML file
tree = ET.parse('/content/sample_data/response_1722494619624.xml')
root = tree.getroot()

# Create a list to store the data
data = []

# Iterate through each 'item' element in the XML
for item in root.findall('.//item'):
    record = {}
    # Store each child element as tag -> text
    for child in item:
        record[child.tag] = child.text
    data.append(record)

# Convert the list of dictionaries to a DataFrame
df = pd.DataFrame(data)

# Save the DataFrame to an Excel file
df.to_excel('/content/sample_data/api_data.xlsx', index=False)

print("Data successfully saved to api_data.xlsx")
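To see what the loop above collects, here is a small self-contained sketch that applies the same item-to-dictionary logic to an inline XML string. The tag names (`statsYm`, `totNmprCnt`) and the values are invented for illustration and are not taken from the actual API response.

```python
import xml.etree.ElementTree as ET

# Made-up response fragment; real tag names come from the downloaded XML
SAMPLE_XML = """
<response>
  <body>
    <items>
      <item><statsYm>202401</statsYm><totNmprCnt>12</totNmprCnt></item>
      <item><statsYm>202402</statsYm><totNmprCnt>7</totNmprCnt></item>
    </items>
  </body>
</response>
""".strip()


def items_to_records(xml_text):
    """Turn every <item> element into a {tag: text} dictionary."""
    root = ET.fromstring(xml_text)
    return [{child.tag: child.text for child in item} for item in root.iter("item")]


records = items_to_records(SAMPLE_XML)
print(records)
```

Every value comes back as a string, so count columns would need converting (for example with `pandas.to_numeric`) before doing arithmetic on them.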

 

Result:

 

 

3. I wanted to save the file with the Korean variable names shown in 'Data lookup', so I added the code below.

 

Code:

import pandas as pd
import xml.etree.ElementTree as ET

# Parse the XML file
tree = ET.parse('/content/sample_data/response_1722494619624.xml')
root = tree.getroot()

# Define the Korean column names: 12 fixed fields, then one column per age
# (0 to 110) for males, then the same for females
columns = [
    "통계년월", "전입행정기관코드", "전출행정기관코드", "전입시도명", "전입시군구명", "전입행정동명",
    "전출시도명", "전출시군구명", "전출행정동명", "총인구수", "남자인구수", "여자인구수",
]
columns += [f"만{age}세남자" for age in range(111)]
columns += [f"만{age}세여자" for age in range(111)]

# Create a list to store the data
data = []

# Iterate through each 'item' element in the XML
for item in root.findall('.//item'):
    record = {}
    # Store each child element as tag -> text
    for child in item:
        record[child.tag] = child.text
    data.append(record)

# Convert the list of dictionaries to a DataFrame
df = pd.DataFrame(data)

# Rename the columns by position (this assumes the XML tags appear in the
# same order as the list above; the column order itself is not changed)
df.columns = columns[:len(df.columns)]

# Save the DataFrame to an Excel file
df.to_excel('/content/sample_data/api_data.xlsx', index=False)

print("Data successfully saved to api_data.xlsx")
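One caveat with assigning `df.columns` by position is that it relies on the XML tags always appearing in the same order as the list. A more defensive variant, sketched below with plain dictionaries, renames by tag name instead. The tag names in `TAG_TO_KOREAN` are hypothetical; the real ones have to be read from the downloaded XML.

```python
# Hypothetical tag names mapped to Korean column labels
TAG_TO_KOREAN = {
    "statsYm": "통계년월",
    "totNmprCnt": "총인구수",
}


def rename_keys(records, mapping):
    """Rename dictionary keys by name; tags missing from the mapping are kept."""
    return [
        {mapping.get(tag, tag): value for tag, value in record.items()}
        for record in records
    ]


records = [{"totNmprCnt": "12", "statsYm": "202401", "extraTag": "x"}]
renamed = rename_keys(records, TAG_TO_KOREAN)
print(renamed)
```

With a DataFrame, the same idea can be expressed as `df.rename(columns=TAG_TO_KOREAN)`, which leaves unmapped columns untouched.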

 

Result:

 

 

Limitations

 

1. Saving the data with Python only succeeded after I had downloaded the XML file through 'Data lookup' (데이터 조회하기) on the Public Data Portal and 'OpenAPI 실행 준비' (Prepare to run OpenAPI) below the API list. When I ran it directly before doing so, I kept getting the server error "HTTP Error: 500 Internal Server Error".

 

2. In "지역별 인구이동 현황" (Population Movement Status by Region) provided by the Ministry of the Interior and Safety, the move-in and move-out administrative districts are restricted to the same sido (province or metropolitan city), so population movement by si/gun/gu cannot be viewed in one go. The query period is also limited to three months, specified by month rather than by year. Acquiring the data therefore looks too time-consuming, which makes me question how efficiently this public dataset can be used.
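If the three-month limit per query holds, collecting a longer period means splitting it into consecutive windows and issuing one request per window. The following is a minimal sketch of that bookkeeping only. It makes no request, and how the start and end months map onto the API's request variables is left out because those parameter names are not given here.

```python
def month_windows(start_ym, end_ym, size=3):
    """Split [start_ym, end_ym] (YYYYMM strings, inclusive) into windows of
    at most `size` months, returned as (first, last) YYYYMM pairs."""

    def to_index(ym):
        # Count months from year 0 so that arithmetic crosses year boundaries
        return int(ym[:4]) * 12 + int(ym[4:]) - 1

    def to_ym(index):
        return f"{index // 12:04d}{index % 12 + 1:02d}"

    first, last = to_index(start_ym), to_index(end_ym)
    windows = []
    while first <= last:
        window_end = min(first + size - 1, last)
        windows.append((to_ym(first), to_ym(window_end)))
        first = window_end + 1
    return windows


windows = month_windows("202311", "202406")
print(windows)
```

For the eight months from 202311 to 202406 this yields three windows, so a full year would take at least four requests per region pair under the same assumption.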

 
