Reshape Data: Pivot

Write a solution to pivot the data so that each row holds the temperatures for a specific month and each city appears as its own column.

#pandas #table-reshaping #sorting-and-grouping
DataFrame weather
+-------------+--------+
| Column Name | Type   |
+-------------+--------+
| city        | object |
| month       | object |
| temperature | int    |
+-------------+--------+

The result format is shown in the following example.

Example 1:

Input:

+--------------+----------+-------------+
| city         | month    | temperature |
+--------------+----------+-------------+
| Jacksonville | January  | 13          |
| Jacksonville | February | 23          |
| Jacksonville | March    | 38          |
| Jacksonville | April    | 5           |
| Jacksonville | May      | 34          |
| ElPaso       | January  | 20          |
| ElPaso       | February | 6           |
| ElPaso       | March    | 26          |
| ElPaso       | April    | 2           |
| ElPaso       | May      | 43          |
+--------------+----------+-------------+

Output:

+----------+--------+--------------+
| month    | ElPaso | Jacksonville |
+----------+--------+--------------+
| April    | 2      | 5            |
| February | 6      | 23           |
| January  | 20     | 13           |
| March    | 26     | 38           |
| May      | 43     | 34           |
+----------+--------+--------------+

Explanation:
- The table is pivoted: each column represents a city and each row represents a specific month.

Solution

import pandas as pd


def pivotTable(weather: pd.DataFrame) -> pd.DataFrame:
    return weather.pivot(index='month',
                         columns='city',
                         values='temperature')
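
The function above returns `month` as the index of the result, while the Output table in the example shows it as an ordinary column. If a flat layout is wanted, one possible approach is sketched below. The helper name `pivot_with_month_column` and the small `sample` frame are hypothetical and used only for illustration; they are not part of the original solution.

```python
import pandas as pd


def pivot_with_month_column(weather: pd.DataFrame) -> pd.DataFrame:
    """Hypothetical variant of pivotTable that returns `month` as a regular column."""
    wide = weather.pivot(index='month', columns='city', values='temperature')
    # Drop the leftover 'city' label on the column axis,
    # then move the 'month' index into a regular column.
    wide.columns.name = None
    return wide.reset_index()


# Illustrative subset of the example input (an assumption, not the full data set).
sample = pd.DataFrame({
    'city': ['Jacksonville', 'Jacksonville', 'ElPaso', 'ElPaso'],
    'month': ['January', 'April', 'January', 'April'],
    'temperature': [13, 5, 20, 2],
})

result = pivot_with_month_column(sample)
```

Whether this extra step is needed depends on how the result is consumed; keeping `month` as the index is often convenient for label-based lookups.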


def test_pivot_table():
    columns = {
        'city': 'object',
        'month': 'object',
        'temperature': 'Int64'
    }

    data = [
        ['Jacksonville','January',13],
        ['Jacksonville','February',23],
        ['Jacksonville','March',38],
        ['Jacksonville','April',5],
        ['Jacksonville','May',34],
        ['ElPaso','January',20],
        ['ElPaso','February',6],
        ['ElPaso','March',26],
        ['ElPaso','April',2],
        ['ElPaso','May',43],
    ]
    weather = pd.DataFrame(data,
                           columns=columns.keys()).astype(columns)
    got = pivotTable(weather)
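    # Build the expected frame by hand so the test does not
    # simply repeat the implementation under test.
    expected = pd.DataFrame(
        [[2, 5], [6, 23], [20, 13], [26, 38], [43, 34]],
        index=pd.Index(['April', 'February', 'January', 'March', 'May'],
                       name='month', dtype=object),
        columns=pd.Index(['ElPaso', 'Jacksonville'],
                         name='city', dtype=object),
        dtype='Int64',
    )
    pd.testing.assert_frame_equal(got, expected)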
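
The example Output lists the months alphabetically (April, February, January, ...) rather than in calendar order, because `pivot` returns sorted row and column labels. The sketch below shows that behavior on a small illustrative frame, followed by one possible way to restore calendar order with `reindex`. The `sample` frame and the `calendar_order` list are assumptions made for this example; the exercise itself expects the alphabetical order.

```python
import pandas as pd

# Illustrative single-city subset of the example input.
sample = pd.DataFrame({
    'city': ['ElPaso', 'ElPaso', 'ElPaso'],
    'month': ['January', 'February', 'March'],
    'temperature': [20, 6, 26],
})

wide = sample.pivot(index='month', columns='city', values='temperature')

# The row labels come back sorted, not in input order.
alphabetical = list(wide.index)

# Hypothetical follow-up step: restore calendar order with an explicit list.
calendar_order = ['January', 'February', 'March']
by_calendar = wide.reindex(calendar_order)
```

Here `reindex` only rearranges existing rows; a month listed in `calendar_order` but absent from the data would show up as a row of missing values.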

slackmart blog © 2024