Overview
– Open source Python library for data analysis and manipulation
– released in 2008 by Wes McKinney
– written in Python, Cython, C
– Name is derived from “Panel Data”
– is, next to NumPy, SciPy and Matplotlib, one of the most important data manipulation and analysis tools in Python → all are compatible with each other
→ Its strength lies in processing and evaluating tabular data and time series



Data structures
– Pandas defines its own data objects for data processing
→ these form the basis for its functions and tools
Series Object
– 1-dimensional
– Data structure with two arrays (one array as index + one array with data)
– can accept different types of data (ints, strings …)
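– When adding several Series, values are aligned by index label; the result index is the union of the indices, and labels missing from one Series yield NaN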
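The points above can be sketched in a few lines. This is a minimal illustration, not a complete tour of the Series API; the labels and values are made-up example data.

```python
import pandas as pd

# Hypothetical example data: two Series with partly overlapping index labels.
s1 = pd.Series([1, 2, 3], index=["a", "b", "c"])
s2 = pd.Series([10, 20, 30], index=["b", "c", "d"])

# A Series pairs an index array with a data array.
labels = list(s1.index)   # the index labels
values = s1.to_numpy()    # the underlying data as a NumPy array

# Addition aligns on index labels; the result index is the union of both.
# Labels present in only one Series ("a" and "d") become NaN.
total = s1 + s2

# A Series can also hold non-numeric data such as strings.
names = pd.Series(["x", "y"], index=[0, 1])
```

Because alignment happens on labels rather than positions, the order of the entries in the two Series does not affect which values are added together.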
DataFrame Object
– 2-dimensional
– contains an ordered collection of columns
– different columns can consist of different data types
– Each value is uniquely identified by a row index and a column index
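A small sketch of these properties, assuming made-up column names and row labels purely for illustration:

```python
import pandas as pd

# Hypothetical example data; column names and row labels are invented.
df = pd.DataFrame(
    {
        "city": ["Berlin", "Paris", "Rome"],   # strings
        "population_m": [3.6, 2.1, 2.8],       # floats
        "capital": [True, True, True],         # booleans
    },
    index=["de", "fr", "it"],
)

# Columns keep their insertion order, and each column has its own dtype.
column_order = list(df.columns)
dtypes = df.dtypes

# A single value is addressed by its row label and column label.
paris_population = df.loc["fr", "population_m"]

# The same cell, addressed by integer position (row 1, column 1).
same_value = df.iloc[1, 1]
```

Here `loc` selects by label and `iloc` by position; both return the same cell because each value sits at exactly one row/column intersection.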
Panel Object
– 3-dimensional data sets
– consists of DataFrames
– deprecated in pandas 0.20 and removed in 0.25
– Axes:
→ items – each item corresponds to a DataFrame contained inside.
→ major axis – index (rows) of each of the DataFrames.
→ minor axis – columns of each of the DataFrames.
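In pandas versions that no longer ship a Panel class, the same three-axis layout can be approximated with a dict of DataFrames stacked into one DataFrame with a two-level row index. The sketch below is an assumption about how one might model it, not the original Panel API; the names `item1`, `item2`, `r1`, `r2`, `x` and `y` are hypothetical.

```python
import pandas as pd

# Hypothetical data: two small DataFrames sharing the same rows and columns.
rows = ["r1", "r2"]
cols = ["x", "y"]
frames = {
    "item1": pd.DataFrame([[1, 2], [3, 4]], index=rows, columns=cols),
    "item2": pd.DataFrame([[5, 6], [7, 8]], index=rows, columns=cols),
}

# Stack the frames: the dict keys become the outer index level ("item"),
# the original row labels become the inner level ("major").
stacked = pd.concat(frames, names=["item", "major"])

# items: one DataFrame per key
item2 = stacked.loc["item2"]

# major axis = row labels, minor axis = column labels
value = stacked.loc[("item2", "r1"), "y"]
```

Selecting one key from the outer level gives back an ordinary DataFrame, which mirrors the idea that each item of a Panel corresponds to one contained DataFrame.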
Major Applications


