
Getting Started with Open API (3)

by soychoi 2024. 8. 1.
Converting Data from Open API to Excel on data.go.kr
Dataset: 행정안전부_통계연보_인구 규모별 행정구역 (Ministry of the Interior and Safety, Statistical Yearbook: Administrative Districts by Population Size)

 

 

1. To convert the "Ministry of the Interior and Safety - Statistical Yearbook: Administrative Districts by Population Size" dataset, provided as an Open API on the Public Data Portal (data.go.kr), into a data file with Python code on Google Colab, I submitted a "usage application" and was issued a URL and a service key.
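One detail worth noting about the issued key: data.go.kr typically provides it in both an "Encoding" and a "Decoding" form, and the decoded one (which contains characters like /, + and =) is the one to pass through requests' params, since requests percent-encodes parameters for you. A minimal sketch with a dummy key and a hypothetical base URL (neither is the real value):

```python
import requests

# Sketch of how the issued key travels in a request, using a hypothetical
# base URL and a shortened dummy key (not the real values).
api_url = "https://apis.data.go.kr/example/service"
service_key = "abc/def+ghi=="  # decoded-form keys contain '/', '+', '='

prepared = requests.Request(
    "GET", api_url, params={"ServiceKey": service_key, "type": "xml"}
).prepare()

# requests percent-encodes the parameters, so the decoded key is safe to pass
# via `params`; pasting an already-encoded key here would get encoded twice.
print(prepared.url)
```

If the service rejects the key, a doubly-encoded key (an "Encoding" key passed through `params`) is a common cause.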

 

 

 

2. Download the technical document and check the request and response message specs.

Document: Ministry of the Interior and Safety "Administration & Safety Public Data Open API Usage Guide" (행정안전부 행정·안전 공공데이터 Open API 활용가이드)
Section: 1. Service Specification > C. Detailed Function List > b. Request Message Spec & c. Response Message Spec
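As a quick offline check of the parsing approach, the response layout described in the guide (a header with resultCode/resultMsg and <row> items in the body) can be mimicked with a hand-written sample. The field names follow the spec; the values here are invented:

```python
import xml.etree.ElementTree as ET

# Hand-written sample mimicking the response layout in the guide
# (header with resultCode/resultMsg, body with <row> items);
# the values are made up for illustration.
sample = """<response>
  <header><resultCode>INFO-0</resultCode><resultMsg>NORMAL SERVICE</resultMsg></header>
  <body>
    <row><bas_yy>2019</bas_yy><city_smry>75</city_smry></row>
  </body>
</response>"""

root = ET.fromstring(sample)
# Collect each <row>'s child tags and texts into a dict, one dict per row
rows = [{e.tag: e.text for e in row} for row in root.findall(".//row")]
print(rows)  # [{'bas_yy': '2019', 'city_smry': '75'}]
```

This is the same extraction pattern the full script below uses in its parse_xml function.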

 

 

 

3. For the API URL used in Google Colab, use the value in the "Call Back URL" field.

Where to find it: 1. Service Specification > 1.1 Public Data API Service > C. Detailed Function List
* Note that this differs from the End Point URL.

 

Call Back URL

 

 

Result:

 

 

Code:

import pandas as pd
import requests
import xml.etree.ElementTree as ET
from google.colab import files

# API URL and key
api_url = "YOUR_CALLBACK_URL"  # paste the Call Back URL issued for your application
service_key = "kGUKTo8Mh/FFlsfrtw7HOGgfkKwZkXH8TNNzDcOdXpdg4I5RrbSu89yzFp9PLET6xzVDhwQU5VyRCUKIoG2YQg=="

# Function to fetch data from API
def fetch_data(year, page_no, num_of_rows):
    params = {
        'ServiceKey': service_key,
        'type': 'xml',
        'pageNo': page_no,
        'numOfRows': num_of_rows,
        'bas_yy': year
    }
    response = requests.get(api_url, params=params)
    response.raise_for_status()  # Check for request errors
    return response.content

# Parse XML and extract data
def parse_xml(xml_data):
    root = ET.fromstring(xml_data)
    data = []
    for row in root.findall('.//row'):
        record = {}
        for elem in row:
            record[elem.tag] = elem.text
        data.append(record)
    return data

# Loop through years and pages to fetch all data
all_data = []
years = range(2010, 2020)  # Change range as needed
num_of_rows = 1000
for year in years:
    page_no = 1
    while True:
        xml_data = fetch_data(year, page_no, num_of_rows)
        data = parse_xml(xml_data)
        if not data:
            break  # No more data to fetch
        all_data.extend(data)
        page_no += 1

# Create DataFrame and save to Excel
df = pd.DataFrame(all_data)
excel_file = 'AdministrativeDistStatPopSize_all_years.xlsx'
df.to_excel(excel_file, index=False)

# Download the Excel file
files.download(excel_file)
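The year/page loop above can be exercised offline before pointing it at the live service, by swapping the API call for a stub that serves made-up records a page at a time and ends with an empty page:

```python
# Stub standing in for the API call: serves slices of a made-up record list
# a page at a time; an out-of-range page comes back empty, like the real
# service returning no more <row> items.
def fetch_page(records, page_no, num_of_rows):
    start = (page_no - 1) * num_of_rows
    return records[start:start + num_of_rows]

records = list(range(25))  # 25 dummy records
num_of_rows = 10

# Same while-True paging pattern as the script above
all_data, page_no = [], 1
while True:
    page = fetch_page(records, page_no, num_of_rows)
    if not page:
        break  # No more data to fetch
    all_data.extend(page)
    page_no += 1

print(len(all_data), page_no)  # 25 4
```

Three pages (10, 10, 5 records) are collected and the fourth, empty page stops the loop, confirming the termination condition works.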

 

 

 

4. If you would like the column headers in Korean, you can write the code as follows.

 
import pandas as pd
import requests
import xml.etree.ElementTree as ET
from google.colab import files

# API URL and key
api_url = "YOUR_CALLBACK_URL"  # paste the Call Back URL issued for your application
service_key = "kGUKTo8Mh/FFlsfrtw7HOGgfkKwZkXH8TNNzDcOdXpdg4I5RrbSu89yzFp9PLET6xzVDhwQU5VyRCUKIoG2YQg=="

# Column name mapping from English to Korean
column_mapping = {
    "totalCount": "전체 결과 수",
    "numOfRows": "한 페이지결과 수",
    "pageNo": "페이지 번호",
    "type": "수신 문서형식",
    "resultCode": "결과코드",
    "resultMsg": "결과메세지",
    "bas_yy": "기준년도",
    "city_smry": "시 계",
    "city_tths50_mor": "시 50만이상",
    "city_tths30_ut_tths50": "시 30만이상 50만미만",
    "city_tths10_ut_tths30": "시 10만이상 30만미만",
    "city_tths10_lss": "시 10만미만",
    "cnti_smry": "군 계",
    "cnti_tths10_mor": "군 10만이상",
    "cnti_tths5_ut_tths10": "군 5만이상 10만미만",
    "cnti_tths3_ut_tths5": "군 3만이상 5만미만",
    "cnti_tths3_lss": "군 3만미만",
    "atodstri_smry": "자치구 계",
    "atodstri_tths50_mor": "자치구 50만이상",
    "atodstri_tths30_ut_tths50": "자치구 30만이상 50만미만",
    "atodstri_tths30_lss": "자치구 30만미만",
    "eup_smry": "읍 계",
    "eup_tths3_mor": "읍 3만이상",
    "eup_tths2_ut_tths3": "읍 2만이상 3만미만",
    "eup_tths1_ut_tths2": "읍 1만이상 2만미만",
    "eup_tths1_lss": "읍 1만미만",
    "myeon_smry": "면 계",
    "myeon_tths2_mor": "면 2만이상",
    "myeon_tths1_ut_tths2": "면 1만이상 2만미만",
    "myeon_ths5_ut_tths1": "면 5천이상 1만미만",
    "myeon_ths5_lss": "면 5천미만",
    "dong_smry": "동 계",
    "dong_tths3_mor": "동 3만이상",
    "dong_tths2_ut_tths3": "동 2만이상 3만미만",
    "dong_tths1_ut_tths2": "동 1만이상 2만미만",
    "dong_ths5_ut_tths1": "동 5천이상 1만미만",
    "dong_ths5_lss": "동 5천미만"
}

# Function to fetch data from API
def fetch_data(year, page_no, num_of_rows):
    params = {
        'ServiceKey': service_key,
        'type': 'xml',
        'pageNo': page_no,
        'numOfRows': num_of_rows,
        'bas_yy': year
    }
    response = requests.get(api_url, params=params)
    response.raise_for_status()  # Check for request errors
    return response.content

# Parse XML and extract data
def parse_xml(xml_data):
    root = ET.fromstring(xml_data)
    data = []
    for row in root.findall('.//row'):
        record = {}
        for elem in row:
            record[elem.tag] = elem.text
        data.append(record)
    return data

# Loop through years and pages to fetch all data
all_data = []
years = range(2010, 2020)  # Adjust the range as needed
num_of_rows = 1000
for year in years:
    page_no = 1
    while True:
        xml_data = fetch_data(year, page_no, num_of_rows)
        data = parse_xml(xml_data)
        if not data:
            break  # No more data to fetch
        all_data.extend(data)
        page_no += 1

# Create DataFrame and rename columns to Korean
df = pd.DataFrame(all_data)
df.rename(columns=column_mapping, inplace=True)

# Save to Excel
excel_file = 'AdministrativeDistStatPopSize_all_years_kr.xlsx'
df.to_excel(excel_file, index=False)

# Download the Excel file
files.download(excel_file)
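The renaming step itself can be verified on a tiny made-up frame before running the full download; only two of the mapped columns are shown here, with invented values:

```python
import pandas as pd

# Minimal demonstration of the renaming step with two mapped columns
# (the full column_mapping above covers every field in the response spec);
# the row values are made up.
df = pd.DataFrame([{"bas_yy": "2019", "city_smry": "75"}])
df = df.rename(columns={"bas_yy": "기준년도", "city_smry": "시 계"})
print(list(df.columns))  # ['기준년도', '시 계']
```

Any response field missing from the mapping simply keeps its English name, since DataFrame.rename ignores keys that are not present.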

 

Result:

 

 

* Note on errors

 

Since using the End Point URL in this code caused errors, I recommend running the code with the Call Back URL instead.

 

 
