Essential libraries for data analysis

1. pandas

2. NumPy

3. Matplotlib

4. SciPy

5. scikit-learn

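0. pandas: a data analysis tool

A tool optimized for collecting and cleaning data
Open source
Built on Python, so it is easy to learn
Said to handle roughly 80-90% of everyday data science work
(Primary purpose) Bringing data of many different types into one common format
pandas consists of several classes and many built-in functions, e.g. Series(), DataFrame(), read_csv(), read_excel()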

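1. Data structures

Series - a one-dimensional array

Its structure resembles a dictionary, so converting a dictionary into a Series is a common pattern: pd.Series(dict_data)

Built with the pandas constructor Series()

Pass the dictionary as the argument: dictionary keys become the Series index, and dictionary values become the Series values (elements)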
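The dictionary-to-Series conversion described above can be sketched as follows. The subject names and scores are made-up sample data, not values from the course material.

```python
import pandas as pd

# Hypothetical sample data: subject -> score
dict_data = {"korean": 80, "english": 50, "math": 90}

# Dictionary keys become the index, dictionary values become the elements
sr = pd.Series(dict_data)

print(sr.index.tolist())   # index labels
print(sr.values.tolist())  # stored values
print(sr["math"])          # label-based lookup, much like a dictionary
```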

DataFrame - a two-dimensional array (the structure used most in practice)

Made up of rows and columns

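# Import the library
import pandas as pd

# Create a DataFrame with a row index and column names
# (이름 = name, 나이 = age, 키 = height; 선수1/선수2 = player 1/player 2)
df = pd.DataFrame([["정대만", 19, 184],
                   ["서태웅", 17, 187]],
                  index = ['선수1', '선수2'], # row index
                  columns = ['이름', '나이', '키']) # column names
# Check the row index
print(df.index)
# Check the column names
print(df.columns)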

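Renaming: df.rename

df.rename(index = {'old_label': 'new_label'}) returns a new DataFrame

df.rename(index = {'old_label': 'new_label'}, inplace = True) changes the original directly (the same inplace pattern applies to other methods such as drop)

Deleting rows/columns

df.drop('row_label', axis = 0) deletes a row

df.drop('column_name', axis = 1) deletes a column

df.drop(['label1', 'label2'], axis = 0) deletes several at once

Selecting rows/columns

df.loc['row_label'] returns a Series (1-D)

df.loc[['row_label']] returns a DataFrame (2-D)

df.loc[['row_label1', 'row_label2']] selects several rows

df['column_name'] selects a column as a Series

df.iloc[0] # iloc selects by integer position

df.iloc[0:2] # a slice selects a range (the end position is excluded)

Selecting rows and columns together

df.loc['row_label', 'column_name']

df.iloc[0, 0]

df.loc[['row_label1', 'row_label2'], 'column_name']

Adding rows/columns

df['new_column'] = a value or a list (adds a column)

df.loc['new_row_label'] = a value or a list (adds a row)

Changing an element: select a row and a column together, then assign the new value

df.loc['row_label', 'column_name'] = value

df.iloc[0, 0] = value

Swapping rows and columns

df.T

df.transpose()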

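# A method is a function defined inside a class
# Renaming row index labels and column names
# rename returns a new DataFrame; labels that are not in the mapping stay as they are
df_new = df.rename(index = {'선수1': '학생1', '선수2': '학생2'})
# inplace = True changes the object itself (done on a copy here so that df keeps its labels)
df_copy = df.copy()
df_copy.rename(columns = {'이름': '성명', '나이': '연령', '키': '신장'}, inplace = True)

# Deleting a row / a column (drop returns a new DataFrame)
df1 = df.drop('선수1', axis = 0)
df2 = df.drop('나이', axis = 1)

# Selecting rows
df.loc['선수1'] # Series
df.loc[['선수1']] # DataFrame
df.iloc[0] # Series
df.iloc[0:2] # DataFrame

# Selecting columns (omitted)
# Selecting rows and columns together
df.loc['선수1','이름']
df.iloc[0,0]
df.loc[['선수1','선수2'],'이름']

# Add a new column (몸무게 = weight)
df['몸무게'] = [50,60]
# Add a new row
df.loc['선수3'] = 0 # every column gets the same value 0

# Change element values
df.loc['선수1','이름'] = '홍길동'
df.iloc[0,0] = '박혁거세'

# Swap rows and columns
df.T
df.transpose()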
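As a compact, runnable recap of the selection and modification rules above, here is a sketch that rebuilds the two-player frame and checks what each call returns. The weights and the third player are made-up values added only for illustration.

```python
import pandas as pd

df = pd.DataFrame([["정대만", 19, 184],
                   ["서태웅", 17, 187]],
                  index=["선수1", "선수2"],
                  columns=["이름", "나이", "키"])

row_as_series = df.loc["선수1"]     # single label -> Series
row_as_frame = df.loc[["선수1"]]    # list of labels -> DataFrame
first_two = df.iloc[0:2]            # positional slice, end excluded

# rename returns a new object unless inplace=True is passed
renamed = df.rename(columns={"키": "height"})

# A new column is added with df[...], a new row with df.loc[...]
df["몸무게"] = [70, 75]                       # made-up values
df.loc["선수3"] = ["강백호", 17, 188, 83]     # made-up row

print(type(row_as_series).__name__, type(row_as_frame).__name__)
print(df.shape)
```

Note that `df["선수3"] = 0` would create a column named 선수3, which is why the row is added through `df.loc`.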

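2. Working with the index

Setting the index

df.set_index('column_name') (the column to use as the index)

Reordering the index

df.reindex([new index labels])

Rows for newly added index labels are filled with NaN; fill_value sets a different filler value

Note that reindex has no inplace = True option, so assign the result to a variable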

# Set the index
df.set_index('이름')
df.set_index(['이름','나이'])

# Reorder the index (assumes a DataFrame whose row labels are r0, r1, r2)
# Rows for newly added labels are filled with NaN
df2 = df.reindex(['r0','r1','r2','r3'])
# Fill with 0 instead of NaN
df2 = df.reindex(['r0','r1','r2','r3'], fill_value = 0)

Resetting the index

df.reset_index()

Sorting by the row index

df.sort_index(ascending=False)

Sorting by the values of a column

df.sort_values('c1', ascending=False)

ascending = True sorts in ascending order, False in descending order

# Reset the row index
df.reset_index()

# Sort by the row index
df.sort_index(ascending=False)

# Sort by the values of column c1
df.sort_values('c1', ascending = False)
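The index operations in this section can be tried end to end on a small frame. The sketch below assumes a hypothetical DataFrame with row labels r0 to r2 and columns c0 and c1; the numbers are arbitrary.

```python
import pandas as pd

# Hypothetical frame with row labels r0..r2 and columns c0, c1
df = pd.DataFrame({"c0": [1, 2, 3], "c1": [6, 4, 5]},
                  index=["r0", "r1", "r2"])

# reindex returns a new frame; the new label r3 has no data
with_nan = df.reindex(["r0", "r1", "r2", "r3"])
with_zero = df.reindex(["r0", "r1", "r2", "r3"], fill_value=0)

# Sort by row labels (descending) and by the values of column c1 (descending)
by_label = df.sort_index(ascending=False)
by_value = df.sort_values("c1", ascending=False)

# set_index moves a column into the index; reset_index moves it back
indexed = df.set_index("c0")
restored = indexed.reset_index()

print(with_nan)
print(by_value)
```

None of these calls modifies `df`; each returns a new object, which is why every result is assigned to its own name.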

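3. Arithmetic operations

Series + 10 => Series

import pandas as pd

# Convert a dictionary into a Series (국어 = Korean, 영어 = English, 수학 = math)
student1 = pd.Series({"국어": 80,
                      "영어": 50,
                      "수학": 90})
print(student1)
# Dictionary keys = Series index
# Dictionary values = Series values

student1_edit = student1 + 10
print(type(student1_edit)) # the result is again a Series

Arithmetic between two Series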

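# Convert dictionaries into Series
student1 = pd.Series({"국어": 80,
                      "영어": 50,
                      "수학": 90})
student2 = pd.Series({"영어": 90,
                      "수학": 20,
                      "국어": 90})
print(student1)
print(student2)

# The four basic operations (values are matched by index label, not by position)
plus = student1 + student2
minus = student1 - student2
mul = student1 * student2
div = student1 / student2

# Combine the results into a DataFrame
# (더하기 = add, 빼기 = subtract, 곱하기 = multiply, 나누기 = divide)
df = pd.DataFrame([plus, minus, mul, div],
                  index = ['더하기','빼기','곱하기','나누기'])
print(df)

A result element becomes NaN in two cases:

the matching index label exists, but its value is NaN in one of the operands

the index label has no match in the other operand

=> To keep these from turning into NaN, use the arithmetic methods with fill_value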

import numpy as np
# Convert dictionaries into Series
student1 = pd.Series({"국어": np.nan,
                      "영어": 50,
                      "수학": 90})
student2 = pd.Series({"영어": 90,
                      "국어": 80})

# The four basic operations => arithmetic methods
plus = student1.add(student2, fill_value = 0)
minus = student1.sub(student2, fill_value = 0)
mul = student1.mul(student2, fill_value = 0)
div = student1.div(student2, fill_value = 0)
# Combine the results into a DataFrame
df = pd.DataFrame([plus, minus, mul, div],
                  index = ['더하기','빼기','곱하기','나누기'])
print(df)
# student1's NaN for 국어 and student2's missing 수학 are both treated as 0
# (so the division result for 수학 is 90 / 0 = inf)

Arithmetic between two DataFrames works the same way
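To illustrate the DataFrame case, here is a minimal sketch with two made-up frames: one has a missing value, the other lacks a whole column. It compares the plain operator with the `add` method and `fill_value`.

```python
import numpy as np
import pandas as pd

# Hypothetical scores: a has a NaN, b has no "math" column at all
a = pd.DataFrame({"korean": [80, np.nan], "math": [90, 70]}, index=["s1", "s2"])
b = pd.DataFrame({"korean": [10, 10]}, index=["s1", "s2"])

plain = a + b                     # unmatched or NaN cells stay NaN
filled = a.add(b, fill_value=0)   # a missing operand is treated as 0

print(plain)
print(filled)
```

If a cell is missing in both frames, the result is still NaN even with `fill_value`.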

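4. Data input and output

Loading external files

pd.read_csv('file_path', header = 0) header: the row that holds the column names (0 = first row, None = no header row)

pd.read_csv('file_path', index_col = None) index_col: the column to use as the row index (None = no column)

pd.read_excel('file_path', engine = 'openpyxl', header = 0)

pd.read_excel('file_path', engine = 'openpyxl', index_col = None)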

df = pd.read_csv('/content/drive/MyDrive/hana1/data/read_csv_sample.csv', header = None)
df1 = pd.read_csv('/content/drive/MyDrive/hana1/data/read_csv_sample.csv', index_col = 'c0')

df2 = pd.read_excel('/content/drive/MyDrive/hana1/data/남북한발전전력량.xlsx',
                    engine = 'openpyxl', header = 0)
df3 = pd.read_excel('/content/drive/MyDrive/hana1/data/남북한발전전력량.xlsx',
                    engine = 'openpyxl', index_col = '1990')

Saving

df.to_csv('file_path')

df.to_excel('file_path')

df.to_json('file_path')

# Save (default path = ./ = /content/ on Colab)
df.to_csv('./df_csv.csv')
df.to_excel('./df_excel.xlsx')
df.to_json('./df_json.json')

# Excel file = workbook
# Sheet = worksheet
# The with block saves and closes the file on exit;
# ExcelWriter.save() is not available in recent pandas versions
with pd.ExcelWriter('./wb_excel.xlsx') as wb:
  df.to_excel(wb, sheet_name = 'df1')
  df1.to_excel(wb, sheet_name = 'df2')
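The effect of `header` and `index_col` is easiest to see on a tiny input. The sketch below reads the same made-up CSV text three ways; `io.StringIO` stands in for a file on disk so the example runs without any sample files.

```python
import io
import pandas as pd

# Made-up CSV text standing in for a file on disk
csv_text = "c0,c1,c2\n0,1,4\n1,2,5\n2,3,6\n"

with_header = pd.read_csv(io.StringIO(csv_text), header=0)     # first line = column names
no_header = pd.read_csv(io.StringIO(csv_text), header=None)    # first line = data
indexed = pd.read_csv(io.StringIO(csv_text), index_col="c0")   # column c0 = row index

# to_csv writes the row index by default; index=False leaves it out
buffer = io.StringIO()
with_header.to_csv(buffer, index=False)

print(with_header.shape, no_header.shape, indexed.shape)
print(buffer.getvalue())
```

With `header=None` the header line is read as an ordinary data row, so that frame has one more row and integer column names.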

Getting data from the web

HTML tables

Web scraping

# HTML
url = '/content/drive/MyDrive/hana1/data/sample.html'
# read_html returns a list with one DataFrame per table found in the page
tables = pd.read_html(url)
# print(tables)
print(len(tables)) # 2 tables
for i in range(len(tables)):
  print("table {}".format(i+1))
  print(tables[i])

# Web scraping
# Import the libraries
from bs4 import BeautifulSoup
import requests
import re
import pandas as pd

url = 'https://en.wikipedia.org/w/index.php?title=List_of_American_exchange-traded_funds&oldid=948664741'
resp = requests.get(url)
soup = BeautifulSoup(resp.text, 'lxml')
rows = soup.select('div > ul > li')
etf = {}
for row in rows:
  # ETF name, market name, ticker (short symbol of the ETF)
  # e.g. iShares Russell 1000 Growth Index (NYSE Arca|IWF)
  # ^ = matches the start of the string, $ = matches the end of the string
  # . = matches any character except the newline \n
  # * = the preceding element repeated zero or more times
  # findall = returns every match of the regular expression as a list
  # Raw strings (r"...") keep the backslashes exactly as written
  etf_name = re.findall(r"^(.*) \(NYSE", row.text) # from the start up to " (NYSE"
  etf_market = re.findall(r"\((.*)\|", row.text) # text between ( and |
  etf_ticker = re.findall(r"\|(.*)\)", row.text) # text between | and )
  # Only when all three pieces were found
  if etf_name and etf_market and etf_ticker:
    etf[etf_ticker[0]] = [etf_market[0], etf_name[0]]
df = pd.DataFrame(etf)
print(df)


import re
string = "Vanguard S&P Small-Cap 600 (NYSE Arca|VIOO)"
# A backslash must be followed immediately by the character it escapes
print(re.findall(r'^(.*)\(NYSE', string)) # the name keeps its trailing space
# print(re.findall(r'^(.*)\ (NYSE', string)) # error: '\ ' escapes the space, so '(' opens an unclosed group
print(re.findall(r'^(.*) \(NYSE', string))
# \s = whitespace
print(re.findall(r'^(.*)\s\(NYSE', string))
print(re.findall(r'\((.*)\|', string))
print(re.findall(r'\|(.*)\)', string))
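As an alternative to three separate `findall` calls, the name, market and ticker can be captured with one pattern. This is only a sketch: the helper name `parse_etf` is hypothetical, and the pattern assumes every line has the `name (NYSE ...|TICKER)` shape shown in the example above.

```python
import re

# One pattern with three groups: name, market, ticker
# e.g. "iShares Russell 1000 Growth Index (NYSE Arca|IWF)"
pattern = re.compile(r"^(.*) \((NYSE[^|]*)\|(.*)\)$")

def parse_etf(text):
    """Return (name, market, ticker), or None when the text does not match."""
    match = pattern.match(text)
    return match.groups() if match else None

print(parse_etf("Vanguard S&P Small-Cap 600 (NYSE Arca|VIOO)"))
print(parse_etf("not an ETF line"))
```

Matching once also guarantees that the three pieces come from the same parenthesised part of the line.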

Collecting data through an API

# Install the Google Maps client
!pip install googlemaps
# Menu - Runtime - Restart runtime

# Import the libraries
import googlemaps
import pandas as pd

api_key = '' # enter the key issued to you

# Create the Google Maps client object
maps = googlemaps.Client(key = api_key)

# Geocode a list of place names (three beaches and a railway station in Korea)
places = ['해운대해수욕장','광안리해수욕장','송정해수욕장','정동진역']
lat = []
lng = []
i = 0
for place in places:
  i += 1
  print(i, place)
  geo_coding = maps.geocode(place)[0].get('geometry')
  lat.append(geo_coding['location']['lat'])
  lng.append(geo_coding['location']['lng'])
# 위도 = latitude, 경도 = longitude
df = pd.DataFrame({'위도': lat,
                   '경도': lng})

# Geocode a street address
maps.geocode("서울특별시 성동구 아차산로 111 2층")[0].get('geometry')
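The loop above indexes `[0]` directly, which raises an error when a place returns no result. A small guard can be sketched without calling the API at all. The helper name `extract_lat_lng` and the coordinates below are made up, and the nesting simply mirrors the keys the loop above relies on.

```python
def extract_lat_lng(results):
    """Return (lat, lng) from a geocode-style result list, or None if it is empty."""
    if not results:
        return None
    location = results[0]["geometry"]["location"]
    return location["lat"], location["lng"]

# Made-up response with the same nesting the loop above relies on
fake_results = [{"geometry": {"location": {"lat": 35.0, "lng": 129.0}}}]

print(extract_lat_lng(fake_results))
print(extract_lat_lng([]))
```

In the real loop the helper would receive the return value of `maps.geocode(place)`.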

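5. Exploring the data

Previewing the data

df.head() first 5 rows by default; df.head(10) returns the first 10

df.tail() last 5 rows by default; df.tail(10) returns the last 10

Summary information

df.shape size (rows, columns)

df.info() basic information

df.dtypes data type of each column

df['column'].dtype data type of one column

df.describe() summary of descriptive statistics (numeric columns)

df.describe(include = 'all') also covers non-numeric columns

Counting values

df.count() number of non-missing values per column

Counts of each unique value, sorted from most to least frequent

df.origin.value_counts()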
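The inspection calls above can be tried on a small frame. The sample below is made up; its column names only loosely follow the `mpg` and `origin` columns used elsewhere in these notes.

```python
import pandas as pd

# Made-up sample with one numeric and one categorical column
df = pd.DataFrame({"mpg": [18.0, 15.0, 36.0, 26.0, 32.0, 24.0],
                   "origin": ["USA", "USA", "JPN", "EU", "JPN", "USA"]})

print(df.head(3))                     # first 3 rows (the default is 5)
print(df.shape)                       # (rows, columns)
print(df.dtypes)                      # dtype of every column
print(df.describe(include="all"))     # also summarises the text column
print(df.count())                     # non-missing values per column
print(df["origin"].value_counts())    # most frequent value first
```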

6. Statistical functions

df.mean() mean

df.median() median

df.max() maximum

df.min() minimum

df.std() standard deviation

df.var() variance

df.corr() correlation coefficients
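A short sketch of these functions on made-up data follows. It passes `numeric_only=True` because recent pandas versions do not silently skip text columns in these reductions; whether you need it depends on your pandas version and your columns.

```python
import pandas as pd

# Made-up sample: two numeric columns and one text column
df = pd.DataFrame({"mpg": [10.0, 20.0, 30.0],
                   "weight": [3000.0, 2000.0, 1000.0],
                   "name": ["a", "b", "c"]})

# numeric_only=True restricts the calculation to the numeric columns
means = df.mean(numeric_only=True)
medians = df.median(numeric_only=True)
corr = df.corr(numeric_only=True)

print(means)
print(df["mpg"].std(), df["mpg"].var())   # sample statistics (ddof=1 by default)
print(corr)
```

Calling the function on a single column, as with `df["mpg"].std()`, avoids the question of non-numeric columns entirely.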

7. Plotting tools

df.plot() line plot

df.plot(kind = 'bar') bar chart

df.plot(kind = 'barh') horizontal bar chart

df.plot(kind = 'hist') histogram

df.plot(kind = 'scatter', x = 'column_x', y = 'column_y') scatter plot

df[['column1', 'column2']].plot(kind = 'box') box plot

8. Visualization tools

matplotlib

# Install a Korean font (Colab)
!sudo apt-get install -y fonts-nanum
!sudo fc-cache -fv
!rm ~/.cache/matplotlib -rf
# Menu - Runtime - Restart runtime

# Import the libraries
import pandas as pd
import matplotlib.pyplot as plt

Setting a Korean font

plt.rc("font", family = "NanumGothic")

Adding a chart title and axis labels

plt.title("")

plt.xlabel("")

plt.ylabel("")

Displaying the figure

plt.show()

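# Load the data (population moving between provinces, by year)
df = pd.read_excel('/content/drive/MyDrive/hana1/data/시도별 전출입 인구수.xlsx',
                   engine = 'openpyxl')
# Fill NaN values with the value from the row above (forward fill)
df = df.ffill()
# Assumed step, missing from these notes: keep the rows for moves out of Seoul to other regions
mask = (df['전출지별'] == '서울특별시') & (df['전입지별'] != '서울특별시')
df_seoul = df[mask]
# Drop the 전출지별 (origin region) column
df_seoul = df_seoul.drop('전출지별', axis = 1)
# Rename the column 전입지별 (destination region) to 전입지
df_seoul = df_seoul.rename({'전입지별':'전입지'}, axis = 1)
# Use it as the row index
df_seoul = df_seoul.set_index('전입지')
# Assumed step, missing from these notes: the row for 경기도 (Gyeonggi-do) as a Series
df_ggd = df_seoul.loc['경기도']

# Passing the Series directly produces the same plot
plt.plot(df_ggd)

# Add a chart title and axis labels
# Set the Korean font
plt.rc("font", family = "NanumGothic")
# Line plot
plt.plot(df_ggd.index, df_ggd.values)
# Chart title ("Population moving from Seoul to Gyeonggi-do")
plt.title("서울시에서 경기도로 이동하는 인구")
# Axis labels ("Period (year)", "Number of people moving")
plt.xlabel("기간(연도)")
plt.ylabel("이동 인구 수")
plt.show()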

Figure size

plt.figure(figsize=(width, height))

Rotating the x-axis tick labels

plt.xticks(rotation = 'vertical')

Adding a legend

plt.legend(labels=["legend label"], loc = 'best')

# Styling the plot
# Figure size (width, height)
plt.figure(figsize = (13, 5))
# Rotate the x-axis tick labels
plt.xticks(rotation = 'vertical')
# Line plot
plt.plot(df_ggd.index, df_ggd.values)
# Chart title ("Population moving from Seoul to Gyeonggi-do")
plt.title("서울시에서 경기도로 이동하는 인구")
# Axis labels ("Period (year)", "Number of people moving")
plt.xlabel("기간(연도)")
plt.ylabel("이동 인구 수")
# Legend ("Seoul => Gyeonggi-do")
plt.legend(labels = ["서울시 => 경기도"], loc = 'best', fontsize = 15)
plt.show()

Applying a style

plt.style.use('style_name')

Adjusting the y-axis range

plt.ylim(50000, 800000)

Drawing an arrow

plt.annotate("text", xy = (x, y of the arrow head), xytext = (x, y of the arrow tail), xycoords = 'data',

arrowprops = dict(arrowstyle = '->', color = 'red', lw = 10))

Drawing text

plt.annotate("text", xy = (x, y of the text position), rotation = angle, va = 'baseline' (vertical alignment), ha = 'center' (horizontal alignment),

fontsize = font size)

# List the available styles
plt.style.available

# Apply a style
plt.style.use('ggplot')

# Adjust the y-axis range
plt.ylim(50000, 800000)
# Add annotations
# Arrows
plt.annotate("", xy = (23, 620000), # arrow head
             xytext = (3, 250000), # arrow tail
             xycoords = 'data', # coordinate system
             arrowprops = dict(arrowstyle = '->', color = 'red', lw = 10))
plt.annotate("", xy = (45, 400000), # arrow head
             xytext = (29, 620000), # arrow tail
             xycoords = 'data', # coordinate system
             arrowprops = dict(arrowstyle = '->', color = 'blue', lw = 10))
# Text ("migration increasing")
plt.annotate("인구 이동 증가",
             xy = (12, 400000), # text position
             rotation = 25, # rotation angle
             va = 'baseline', # vertical alignment
             ha = 'center', # horizontal alignment
             fontsize = 15)
# Text ("migration decreasing")
plt.annotate("인구 이동 감소",
             xy = (37, 500000), # text position
             rotation = -18, # rotation angle
             va = 'baseline', # vertical alignment
             ha = 'center', # horizontal alignment
             fontsize = 15)
plt.show()
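The title, label, axis-range and annotation calls above also exist as methods on an Axes object. The sketch below uses that object-style interface on made-up numbers, with the off-screen `Agg` backend so it runs without a display. The series, the labels and the coordinates are illustrative assumptions, not the migration data.

```python
import io

import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt

# Made-up yearly values standing in for the migration series
years = list(range(2000, 2010))
values = [10, 12, 15, 19, 24, 30, 28, 25, 21, 18]

fig, ax = plt.subplots(figsize=(8, 4))
ax.plot(years, values, marker="o", label="sample series")
ax.set_title("Sample trend")
ax.set_xlabel("Year")
ax.set_ylabel("Value")
ax.set_ylim(0, 40)

# Arrow: xy is the head, xytext is the tail (both in data coordinates)
ax.annotate("", xy=(2005, 30), xytext=(2000, 10), xycoords="data",
            arrowprops=dict(arrowstyle="->", color="red", lw=2))
# Text only: placed at xy and rotated to follow the slope
ax.annotate("rising", xy=(2002, 20), rotation=25, va="baseline", ha="center")

ax.legend(loc="best")

# Render to an in-memory PNG instead of opening a window
png = io.BytesIO()
fig.savefig(png, format="png")
```

In a notebook, `plt.show()` would replace the last two lines.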

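Splitting the figure into subplots

1. Create the list of years

col_years = the years 1970 through 2017 as a list of strings

2. Create the Axes objects in a 2 x 2 grid

fig = plt.figure(figsize = (width, height))

ax1 = fig.add_subplot(2,2,1)

ax2 = fig.add_subplot(2,2,2)

ax3 = fig.add_subplot(2,2,3)

ax4 = fig.add_subplot(2,2,4)

Add a plot to each Axes object: ax1.plot(col_years, ...)

Tick labels and axis range: set_xticklabels, set_ylim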

# Subplots
# Four destination provinces: Chungnam, Gyeongbuk, Gangwon, Jeonnam
df_4 = df_seoul.loc[['충청남도','경상북도','강원도','전라남도']]

# Create the list of years
# Strings for 1970 through 2017 (range stops before 2018)
col_years = list(map(str, range(1970,2018)))

# Set the Korean font
plt.rc("font", family = "NanumGothic")
# Apply a style
plt.style.use('ggplot')
# Create the figure and the Axes objects
fig = plt.figure(figsize = (20, 10))
ax1 = fig.add_subplot(2, 2, 1) # number of rows, number of columns, position
ax2 = fig.add_subplot(2, 2, 2)
ax3 = fig.add_subplot(2, 2, 3)
ax4 = fig.add_subplot(2, 2, 4)
# Add a plot to each Axes object (labels read "Seoul => province")
ax1.plot(col_years, df_4.loc['충청남도', :],  marker = 'o', markerfacecolor = 'green',
         markersize = 5, color = 'green', label = "서울 => 충남")
ax2.plot(col_years, df_4.loc['경상북도', :],  marker = 'o', markerfacecolor = 'red',
         markersize = 5, color = 'red', label = "서울 => 경북")
ax3.plot(col_years, df_4.loc['강원도', :],  marker = 'o', markerfacecolor = 'yellow',
         markersize = 5, color = 'yellow', label = "서울 => 강원")
ax4.plot(col_years, df_4.loc['전라남도', :],  marker = 'o', markerfacecolor = 'blue',
         markersize = 5, color = 'blue', label = "서울 => 전남")
# Add legends
ax1.legend(loc = 'best')
ax2.legend(loc = 'best')
ax3.legend(loc = 'best')
ax4.legend(loc = 'best')
# Rotate the x-axis tick labels
ax1.set_xticklabels(col_years, rotation = 90)
ax2.set_xticklabels(col_years, rotation = 90)
ax3.set_xticklabels(col_years, rotation = 90)
ax4.set_xticklabels(col_years, rotation = 90)
# Adjust the y-axis range
ax1.set_ylim(0, 60000)
ax2.set_ylim(0, 60000)
ax3.set_ylim(0, 60000)
ax4.set_ylim(0, 60000)
plt.show()
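The four near-identical blocks above can also be written as a loop over the Axes returned by `plt.subplots`. This is a sketch on made-up data (four regions named A to D, five years), not a rewrite of the course code. It rotates the tick labels with `tick_params` rather than `set_xticklabels`.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up data: 4 regions x 5 years
col_years = list(map(str, range(2013, 2018)))
df_4 = pd.DataFrame([[5, 4, 4, 3, 3],
                     [6, 5, 5, 4, 3],
                     [7, 7, 6, 6, 5],
                     [4, 4, 3, 3, 2]],
                    index=["A", "B", "C", "D"], columns=col_years)
colors = ["green", "red", "orange", "blue"]

# 2 x 2 grid; axes.flat walks the four Axes in row-major order
fig, axes = plt.subplots(2, 2, figsize=(10, 6))
for ax, region, color in zip(axes.flat, df_4.index, colors):
    ax.plot(col_years, df_4.loc[region], marker="o", color=color, label=region)
    ax.legend(loc="best")
    ax.tick_params(axis="x", rotation=90)   # rotate the x tick labels
    ax.set_ylim(0, 10)                      # same y range on every panel
```

Keeping the same `set_ylim` on every panel makes the four regions directly comparable.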

Colors and hex codes

# Look up color names and their hex codes
import matplotlib.colors

colors = {}
for name, hex_code in matplotlib.colors.cnames.items():
  colors[name] = hex_code
print(colors)

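(Stacked) area plot

df.plot(kind = 'area', stacked = False (whether to stack), alpha = transparency, figsize = (width, height))

# The default for the kind parameter is a line plot
# df_4_t is assumed to be df_4 transposed (years as rows, regions as columns)
df_4_t.plot(kind = 'area',
            stacked = False, # whether to stack the areas
            alpha = 0.2, # transparency
            figsize = (7, 5))
# Chart title ("Population moving from Seoul to four other regions")
plt.title("서울시에서 다른 4개 지역으로 이동한 인구")
# Axis labels ("Period (year)", "Number of people moving")
plt.xlabel("기간(연도)", size = 10)
plt.ylabel("이동 인구 수", size = 10)

plt.show()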

Vertical bar chart

df.plot(kind = 'bar', width = bar width, figsize = (width, height), color = [a color for each series])

# Apply a style
plt.style.use('ggplot')
# Vertical bar chart
df_4_t.plot(kind = 'bar',
          figsize = (12,5),
          width = 0.5,
          color = ['green','red','yellow','blue'])
# Chart title ("Population moving from Seoul to four other regions")
plt.title("서울시에서 다른 4개 지역으로 이동한 인구")
# Axis labels ("Period (year)", "Number of people moving")
plt.xlabel("기간(연도)", size = 10)
plt.ylabel("이동 인구 수", size = 10)

plt.show()

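Sorting the bars

Horizontal bar chart

df.plot(kind = 'barh', width = bar width, figsize = (width, height), color = 'blue')

# Sort the totals (합계) in descending order
df_total = df_total[["합계"]].sort_values(by = '합계', ascending = False)
# Sort the totals in ascending order (barh draws the first row at the bottom, so the largest bar ends up on top)
df_total = df_total[["합계"]].sort_values(by = '합계')

# Horizontal bar chart
df_total.plot(kind = 'barh',
                  figsize = (12,5),
                  width = 0.5,
                  color = 'blue')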

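Secondary axis (dual-axis plot)

# Render minus signs correctly when a Korean font is set
plt.rcParams['axes.unicode_minus'] = False

# Stacked bar chart of hydro (수력) and thermal (화력) power generation
ax1 = df_north[['수력', '화력']].plot(kind = 'bar', stacked = True, figsize = (12,6))
ax2 = ax1.twinx() # second y-axis that shares the x-axis

# Line plot of the rate of change (증감율)
ax2.plot(df_north.index, df_north['증감율'], color = 'green', marker = 'o', label = '증감율')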
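Here is a self-contained sketch of the same dual-axis idea on made-up numbers. The column names `hydro`, `thermal` and `change_pct` are placeholders, and the rate of change is computed with `pct_change` as one possible definition. The line is plotted against the positions 0..n-1, which is where pandas places the bars.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import matplotlib.pyplot as plt
import pandas as pd

# Made-up generation figures; column names are placeholders
df = pd.DataFrame({"hydro": [10, 12, 11], "thermal": [20, 18, 25]},
                  index=["2014", "2015", "2016"])
df["total"] = df["hydro"] + df["thermal"]
# Year-over-year change in percent; the first year has no previous value
df["change_pct"] = df["total"].pct_change().fillna(0) * 100

ax1 = df[["hydro", "thermal"]].plot(kind="bar", stacked=True, figsize=(8, 4))
ax2 = ax1.twinx()   # second y-axis that shares the x-axis with ax1

# Bars sit at positions 0..n-1, so the line uses the same positions
ax2.plot(range(len(df)), df["change_pct"], color="green", marker="o",
         label="change (%)")

ax1.set_ylabel("Generation")
ax2.set_ylabel("Change (%)")
ax1.legend(loc="upper left")
ax2.legend(loc="upper right")
```

Each Axes keeps its own legend, so the two `legend` calls are placed in different corners to avoid overlap.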

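Histogram

df.plot(kind = 'hist', bins = number of bins, color = 'blue', figsize = (width, height))

Scatter plot

df.plot(kind = 'scatter', x = 'column_x', y = 'column_y', s = marker size, figsize = (width, height))

Bubble chart (marker color = numeric variable)

df.plot(kind = 'scatter', x = 'column_x', y = 'column_y', c = 'column_name', s = marker size, alpha = transparency, figsize = (width, height))

Bubble chart (marker color = categorical variable)

df.plot(kind = 'scatter', x = 'column_x', y = 'column_y', c = 'column_name', cmap = 'colormap', s = marker size, alpha = transparency, figsize = (width, height))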

# Histogram
df['mpg'].plot(kind = 'hist',
               bins = 15, # number of bins
               color = 'aquamarine',
               figsize = (5,3))

# Scatter plot
df.plot(kind = 'scatter',
        x = 'mpg',
        y = 'weight',
        c = 'tomato', # color
        s = 10, # marker size
        figsize = (6,3))

# Bubble chart
df.plot(kind = 'scatter',
        x = 'mpg',
        y = 'weight',
        c = 'tomato', # color
        s = 'displacement', # marker size taken from a column
        alpha = 0.2,
        figsize = (8,5))
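For the variant in which a column drives the marker color, here is a sketch with made-up rows shaped like the `mpg`, `weight` and `cylinders` columns. Scaling the column to a maximum marker area of 300 is an arbitrary choice made for this illustration.

```python
import matplotlib
matplotlib.use("Agg")  # off-screen backend so the sketch runs without a display
import pandas as pd

# Made-up rows shaped like the auto-mpg columns used above
df = pd.DataFrame({"mpg": [18.0, 15.0, 36.0, 26.0],
                   "weight": [3504, 3693, 1800, 2500],
                   "cylinders": [8, 8, 4, 4]})

# Scale cylinders to 0..1, then to a marker area (arbitrary maximum of 300)
size = df["cylinders"] / df["cylinders"].max() * 300

ax = df.plot(kind="scatter", x="mpg", y="weight",
             c="cylinders", cmap="viridis",   # color taken from a numeric column
             s=size, alpha=0.5, figsize=(8, 5))
ax.set_title("Bubble chart sketch")
```

Passing a column name to `c` together with `cmap` also adds a colorbar that shows how values map to colors.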

Pie chart

df.plot(kind = 'pie', autopct = 'percentage format string', colors = [list of colors], startangle = 0, figsize = (width, height))

df_origin['count'].plot(kind = 'pie',
                        figsize = (10, 5),
                        # autopct = '%.1f', # one decimal place, number only
                        autopct = '%.1f%%', # one decimal place, followed by a % sign
                        colors = ['red','blue','yellow'], # color of each slice
                        startangle = 0) # starting angle

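Box plot

Create a figure object, then call add_subplot(1,1,1)

ax.boxplot(x, labels)

vert = True (default, vertical) / False (horizontal)

# Figure object
fig = plt.figure(figsize = (10, 5))
ax = fig.add_subplot(1,1,1)

# Vertical box plot (mpg_1, mpg_2, mpg_3: mpg values per origin, prepared beforehand)
ax.boxplot(x = [mpg_1, mpg_2, mpg_3],
           labels = ['USA', 'EU', 'JAP'])
plt.show()

# Horizontal box plot, drawn on a new figure because the first one has already been shown
fig = plt.figure(figsize = (10, 5))
ax = fig.add_subplot(1,1,1)
ax.boxplot(x = [mpg_1, mpg_2, mpg_3],
           labels = ['USA', 'EU', 'JAP'],
           vert = False)
plt.show()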

Colormaps

plt.colormaps() lists the names of the available colormaps
