[Open AI] Prompt Engineering

코딩 아가 2025. 8. 28. 20:54

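What is prompt engineering?

The process of carefully designing input prompts to steer an AI model, especially a language or image generation model, toward producing a specific output.

Convey the intent and purpose of the question precisely
Avoid broad or abstract wording
Give examples
Specify the form and length of the answer you want
Assign a role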
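
The tips above can be folded into a small prompt template. The sketch below is only an illustration: `build_prompt` and its parameters are hypothetical names made up for this example, and the example sentence is invented, not taken from the lecture data.

```python
def build_prompt(role, task, examples, answer_format, max_sentences):
    """Assemble a prompt that states a role, a concrete task,
    examples, and the expected form and length of the answer."""
    lines = [
        f"You are {role}.",  # assign a role
        f"Task: {task}",     # state the intent and purpose precisely
        "Examples:",
    ]
    lines += [f"- {example}" for example in examples]  # give examples
    lines.append(f"Answer format: {answer_format}")    # desired form
    lines.append(f"Length: at most {max_sentences} sentences")  # desired length
    return "\n".join(lines)


prompt = build_prompt(
    role="a public health data analyst",
    task="Summarize how life expectancy in one country compares with the global average.",
    examples=["Life expectancy rose steadily over the period."],
    answer_format="a markdown bullet list",
    max_sentences=5,
)
print(prompt)
```

Keeping each tip as its own line makes it easy to see which part of the prompt to tighten when the answer comes back too vague or too long.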

Implementing a report generation feature inside a dashboard

= AI DALL·E API

= logo generation program

Streamlit, Pandas, Matplotlib

1. Build a basic dashboard

2. Generate the report

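merge_and_rename: a function that merges the global average with the per-country data on the key (Year)
to_dict(orient="records"): converts the merged result into a JSON-style list of records
Save graphs to files: save fig1 through fig5, created in the current session, as PNG files
Build the prompt: combine report_data_str (JSON) and the requirements (section breakdown, markdown syntax, etc.) into one string to form the final prompt_text
Call GPT: request the report with askGpt(prompt_text)
Save the result: write the generated report (markdown) to a file (report.md)
How to output/check the file
- Once report.md has been created, open it locally or upload it elsewhere to check it
- st.download_button: offers the file for download directly in the web browser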
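
The first two steps (merge on Year, then convert to records) are easier to follow on a tiny table. The sketch below uses made-up toy values only to show the shape of the data; it mirrors the idea behind merge_and_rename without being the lecture's exact function.

```python
import json

import pandas as pd

# Made-up toy values, only to illustrate the data shape.
global_life = pd.DataFrame({"Year": [2019, 2020], "Life_Expectancy": [72.0, 72.5]})
country_life = pd.DataFrame({"Year": [2019, 2020], "Life_Expectancy": [80.0, 80.5]})

# Merging on Year: the shared column name gets _x (left) and _y (right) suffixes,
# which are then renamed to descriptive labels.
merged = global_life.merge(country_life, on="Year", how="left").rename(
    columns={
        "Life_Expectancy_x": "Life_Expectancy_Global",
        "Life_Expectancy_y": "Life_Expectancy_Country",
    }
)

# orient="records" yields one dict per row, which serializes cleanly to JSON.
records = merged.to_dict(orient="records")
report_data_str = json.dumps({"life_expectancy_trend": records}, indent=2, default=str)
print(report_data_str)
```

One dict per year keeps the global and country values side by side, so the model can compare them without having to join two separate lists itself.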

# Final code

import streamlit as st
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import openai
import json

def askGpt(prompt):
    """
    Send a prompt to a GPT model through the OpenAI API
    and return the response text.
    """
    API_KEY = "sk-proj--A"  # Replace with your own API key when actually using this
    client = openai.OpenAI(api_key=API_KEY)

    response = client.responses.create(
        model="gpt-4.1",
        input=prompt,
    )
    return response.output_text

def merge_and_rename(
    global_df: pd.DataFrame,
    country_df: pd.DataFrame,
    key_column: str,
    global_col_name: str,
    country_col_name: str,
    global_label: str,
    country_label: str,
):
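    # Preprocessing: left-join the country column onto the global averages by key.
    # When both frames use the same column name (as in every call below),
    # pandas adds _x/_y suffixes, which are renamed to the given labels.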
    merged = (
        global_df.merge(
            country_df[[key_column, country_col_name]], on=key_column, how="left"
        )
        .fillna("")
        .rename(
            columns={
                f"{global_col_name}_x": global_label,
                f"{global_col_name}_y": country_label,
            }
        )
    )
    return merged.to_dict(orient="records")

def main():
    st.set_page_config(layout="wide")
    # (1) Load the data
    df = pd.read_csv("global_health.csv")

    with st.sidebar:
        st.title("Select a country")
        countries = sorted(df["Country"].unique())
        selected_country = st.selectbox("Country to analyze", countries)

    st.title(f"{selected_country} Health Indicators Dashboard")

    country_data = df[df["Country"] == selected_country]

    global_data = df.copy()

    latest_year = int(global_data["Year"].max())
    global_latest = global_data[global_data["Year"] == latest_year]
    country_latest = country_data[country_data["Year"] == latest_year]

    st.header("1. Yearly Life Expectancy Comparison")
    global_life = global_data.groupby("Year")["Life_Expectancy"].mean().reset_index()

    fig1, ax1 = plt.subplots(figsize=(8, 5))
    ax1.plot(
        global_life["Year"],
        global_life["Life_Expectancy"],
        label="Global Average",
        linestyle="--",
    )
    ax1.plot(
        country_data["Year"],
        country_data["Life_Expectancy"],
        label=selected_country,
        marker="o",
    )
    ax1.set_xlabel("Year")
    ax1.set_ylabel("Life Expectancy")
    ax1.set_title("Trend of Life Expectancy")
    ax1.legend()
    st.pyplot(fig1)
    st.markdown(
        """
**Description:**
A line chart comparing yearly life expectancy for the global average and the selected country.
"""
    )

    st.header("2. Yearly Fertility Rate Comparison")
    global_fertility = (
        global_data.groupby("Year")["Fertility_Rate"].mean().reset_index()
    )

    fig2, ax2 = plt.subplots(figsize=(8, 5))
    ax2.plot(
        global_fertility["Year"],
        global_fertility["Fertility_Rate"],
        label="Global Average",
        linestyle="--",
    )
    ax2.plot(
        country_data["Year"],
        country_data["Fertility_Rate"],
        label=selected_country,
        marker="o",
    )
    ax2.set_xlabel("Year")
    ax2.set_ylabel("Fertility Rate")
    ax2.set_title("Trend of Fertility Rate")
    ax2.legend()
    st.pyplot(fig2)
    st.markdown(
        """
**Description:**
A chart showing the yearly fertility rate for the global average and the selected country.
"""
    )
    st.header("3. Latest Health Indicators Comparison")
    indicators = {
        "Life Expectancy": "Life_Expectancy",
        "Fertility Rate": "Fertility_Rate",
        "Urban Population (%)": "Urban_Population_Percent",
        "Obesity Rate (%)": "Obesity_Rate_Percent",
        "Underweight Rate (%)": "Underweight_Rate_Percent",
        "Overweight Rate (%)": "Overweight_Rate_Percent",
    }

    global_vals = {}
    country_vals = {}
    for key, col in indicators.items():
        global_vals[key] = global_latest[col].mean()
        country_vals[key] = (
            country_latest[col].mean() if not country_latest.empty else np.nan
        )

    compare_df = pd.DataFrame(
        {
            "Indicator": list(indicators.keys()),
            f"{selected_country}": list(country_vals.values()),
            "Global Average": list(global_vals.values()),
        }
    )

    fig3, ax3 = plt.subplots(figsize=(10, 6))
    bar_width = 0.35
    indices = np.arange(len(compare_df))

    ax3.bar(
        indices, compare_df[f"{selected_country}"], bar_width, label=selected_country
    )
    ax3.bar(
        indices + bar_width,
        compare_df["Global Average"],
        bar_width,
        label="Global Average",
    )
    ax3.set_xticks(indices + bar_width / 2)
    ax3.set_xticklabels(compare_df["Indicator"], rotation=45, ha="right")
    ax3.set_ylabel("Value")
    ax3.set_title("Health Indicators Comparison")
    ax3.legend()
    st.pyplot(fig3)
    st.markdown(
        f"""
**Description:**
A bar chart comparing key health indicators for the latest year ({latest_year}) between the selected country and the global average.
"""
    )
    st.header("4. Global Distribution: Urban Population vs Life Expectancy")
    fig4, ax4 = plt.subplots(figsize=(8, 5))
    ax4.scatter(
        global_latest["Urban_Population_Percent"],
        global_latest["Life_Expectancy"],
        label="Global Countries",
        alpha=0.6,
    )
    ax4.scatter(
        country_data["Urban_Population_Percent"],
        country_data["Life_Expectancy"],
        label=selected_country,
        s=80,
    )
    ax4.set_xlabel("Urban Population (%)")
    ax4.set_ylabel("Life Expectancy")
    ax4.set_title("Urban Population vs Life Expectancy")
    ax4.legend()
    st.pyplot(fig4)
    st.markdown(
        """
**Description:**
A scatter plot for checking the correlation between urban population share (Urban Population %) and life expectancy (Life Expectancy).
"""
    )
    st.header("5. Trend of Sanitary Expense Per Capita")
    global_sanitary = (
        global_data.groupby("Year")["Sanitary_Expense_Per_Capita"].mean().reset_index()
    )

    fig5, ax5 = plt.subplots(figsize=(8, 5))
    ax5.plot(
        global_sanitary["Year"],
        global_sanitary["Sanitary_Expense_Per_Capita"],
        label="Global Average",
        linestyle="--",
    )
    ax5.plot(
        country_data["Year"],
        country_data["Sanitary_Expense_Per_Capita"],
        label=selected_country,
        marker="o",
    )
    ax5.set_xlabel("Year")
    ax5.set_ylabel("Sanitary Expense Per Capita")
    ax5.set_title("Trend of Sanitary Expense Per Capita")
    ax5.legend()
    st.pyplot(fig5)
    st.markdown(
        """
**Description:**
Tracking sanitary expense per capita (Sanitary Expense) by year helps gauge the relationship between health investment levels and health outcomes.
"""
    )
    # 1) Life Expectancy trend
    life_expectancy_trend = merge_and_rename(
        global_df=global_life,
        country_df=country_data,
        key_column="Year",
        global_col_name="Life_Expectancy",  # global average column
        country_col_name="Life_Expectancy",  # selected country's column
        global_label="Life_Expectancy_Global",
        country_label="Life_Expectancy_Country",
    )

    # 2) Fertility Rate trend
    fertility_rate_trend = merge_and_rename(
        global_df=global_fertility,
        country_df=country_data,
        key_column="Year",
        global_col_name="Fertility_Rate",
        country_col_name="Fertility_Rate",
        global_label="Fertility_Rate_Global",
        country_label="Fertility_Rate_Country",
    )

    # 3) Sanitary Expense Per Capita trend
    sanitary_expense_trend = merge_and_rename(
        global_df=global_sanitary,
        country_df=country_data,
        key_column="Year",
        global_col_name="Sanitary_Expense_Per_Capita",
        country_col_name="Sanitary_Expense_Per_Capita",
        global_label="Sanitary_Expense_Global",
        country_label="Sanitary_Expense_Country",
    )

    # 4) Latest indicator comparison table
    latest_indicators = compare_df.to_dict(orient="records")

    # 5) Urban Population vs Life Expectancy (Global)
    urban_life_scatter_global = global_latest[
        ["Country", "Urban_Population_Percent", "Life_Expectancy"]
    ].to_dict(orient="records")

    # 6) Urban Population vs Life Expectancy (Country)
    urban_life_scatter_country = country_data[
        ["Year", "Urban_Population_Percent", "Life_Expectancy"]
    ].to_dict(orient="records")

    # Assemble the final report_data
    report_data = {
        "country": selected_country,
        "latest_year": latest_year,
        "chart_data": {
            "life_expectancy_trend": life_expectancy_trend,
            "fertility_rate_trend": fertility_rate_trend,
            "sanitary_expense_trend": sanitary_expense_trend,
            "latest_indicators": latest_indicators,
            "urban_life_scatter_global": urban_life_scatter_global,
            "urban_life_scatter_country": urban_life_scatter_country,
        },
    }

    st.json(report_data)
    # Convert to a JSON string (embedded in the prompt below)
    report_data_str = json.dumps(report_data, indent=2, default=str)
    st.json(report_data_str)

    with st.sidebar:

        # (2) 'Generate report' button
        if st.button("Generate report"):
            # (A) Save the graphs to files
            figs = [fig1, fig2, fig3, fig4, fig5]
            for i, fig in enumerate(figs, start=1):
                fig.savefig(f"graph{i}.png")

            prompt_text = f"""
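The following compares health indicator data for {selected_country} with global average data.
Please write a markdown-formatted report based on this data.

# Source data needed to generate the report (JSON)
{report_data_str}

# Markdown report requirements
- Write the entire document in Korean
- Split the 5 graphs into separate sections (chapters)
- In each section, describe data insights in detail (interpretation, likely causes, significance, outlook, etc.)
- For each indicator, write insights on notable trends or comparison results
- For each indicator, summarize the overall health status and implications
- End with an overall summary and recommendations.
- Make active use of markdown syntax such as tables and bold text.
- Where each visualization (graph) belongs, insert a markdown image using the actual file names, in this form:
  ![Graph 1](graph1.png)
  ![Graph 2](graph2.png)
  ![Graph 3](graph3.png)
  ![Graph 4](graph4.png)
  ![Graph 5](graph5.png)
- Information about each graph is given below. Use it to insert the markdown images at appropriate places in the report.
  - Graph 1 is "1. Yearly Life Expectancy Comparison": a line chart comparing yearly life expectancy for the global average and the selected country.
  - Graph 2 is "2. Yearly Fertility Rate Comparison": a chart showing the yearly fertility rate for the global average and the selected country.
  - Graph 3 is "3. Latest Health Indicators Comparison": a bar chart comparing key health indicators for the latest year ({latest_year}) between the selected country and the global average.
  - Graph 4 is "4. Global Distribution: Urban Population vs Life Expectancy": a scatter plot for checking the correlation between urban population share (Urban Population %) and life expectancy.
  - Graph 5 is "5. Trend of Sanitary Expense Per Capita": yearly sanitary expense per capita, which helps gauge the relationship between health investment levels and health outcomes.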
"""

            # Ask GPT to generate the report
            generated_report = askGpt(prompt_text)
            print(generated_report)

            # Save the generated report to a file
            with open("report.md", "w", encoding="utf-8") as f:
                f.write(generated_report)

if __name__ == "__main__":
    main()
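
The final code writes report.md but does not yet include the st.download_button step mentioned in the outline. A minimal sketch of how it could be wired in is shown below; `load_report` and `render_download_button` are hypothetical helper names, and the button label and MIME type are assumptions rather than part of the lecture code.

```python
from pathlib import Path


def load_report(path="report.md"):
    """Return the saved report as text, or None if it has not been generated yet."""
    report_path = Path(path)
    if not report_path.exists():
        return None
    return report_path.read_text(encoding="utf-8")


def render_download_button(path="report.md"):
    """Show a download button in the Streamlit app when the report exists."""
    import streamlit as st  # imported here so load_report works without Streamlit

    report_text = load_report(path)
    if report_text is not None:
        st.download_button(
            label="Download report",
            data=report_text,
            file_name="report.md",
            mime="text/markdown",
        )
```

Calling render_download_button() in the sidebar after the report is written would let the user fetch the file from the browser instead of opening it on the machine running the app.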
