---
title: "Rows and Columns"
slug: "rows-and-columns"
updated: 2025-10-28T00:01:14Z
published: 2025-10-28T00:01:14Z
---

# Rows and Columns

Rows and columns sometimes need to be manipulated to clean up the data or to make the data more efficient to access.

## Apply() Function

The `.apply()` function applies a function to every row or column in a data frame, or to every value in a single column. There are two main uses for `.apply()`:

- in-place changes
- new column creation

An in-place change generally involves cleaning up the data or converting the data type. For example, every value in the *extract.patient_full_name* column in the table below ends with an extra comma. An in-place change can remove that comma.

|  | extract.age | extract.patient_full_name | extract.patient_id |
| --- | --- | --- | --- |
| 0 | 43 | Gail Hill, | 94451 |
| 1 | 43 | Gail Hill, | 94451 |
| 2 | 36 | Rhoda Sweet, | 92830 |
| 3 | 13 | Jescie Lyons, | 92721 |
| 4 | 70 | Lyle Cash, | 92570 |

To remove the comma, pass a custom function to `.apply()`:

```python
dataframe['extract.patient_full_name'] = dataframe['extract.patient_full_name'].apply(lambda value: value.replace(',', ''))
```

The code above applies a lambda function that removes commas from each value in the *extract.patient_full_name* column, then overwrites the old column with the result.

|  | extract.age | extract.patient_full_name | extract.patient_id |
| --- | --- | --- | --- |
| 0 | 43 | Gail Hill | 94451 |
| 1 | 43 | Gail Hill | 94451 |
| 2 | 36 | Rhoda Sweet | 92830 |
| 3 | 13 | Jescie Lyons | 92721 |
| 4 | 70 | Lyle Cash | 92570 |
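
The in-place change above can be tried end to end with a small stand-in data frame. The sketch below is illustrative only: the frame is built by hand from the sample rows, `strip_commas` is a hypothetical helper name, and the `.str.replace()` line is shown as an equivalent vectorized alternative rather than the method this page prescribes.

```python
import pandas as pd

# Stand-in for the query result; values copied from the sample table.
dataframe = pd.DataFrame({
    "extract.age": [43, 43, 36, 13, 70],
    "extract.patient_full_name": [
        "Gail Hill,", "Gail Hill,", "Rhoda Sweet,", "Jescie Lyons,", "Lyle Cash,",
    ],
    "extract.patient_id": [94451, 94451, 92830, 92721, 92570],
})


def strip_commas(value):
    """Hypothetical helper: remove every comma from a string."""
    return value.replace(",", "")


# Element-wise version, equivalent to the lambda shown earlier.
cleaned = dataframe["extract.patient_full_name"].apply(strip_commas)

# Vectorized alternative using the pandas string accessor.
cleaned_vectorized = dataframe["extract.patient_full_name"].str.replace(
    ",", "", regex=False
)

# Overwrite the old column with the cleaned values.
dataframe["extract.patient_full_name"] = cleaned
print(dataframe["extract.patient_full_name"].tolist())
```

A named function such as `strip_commas` can be easier to reuse and test than a lambda when the cleanup rule grows beyond a single call.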

Another use for `.apply()` is to convert columns to numeric, date, or object (string) types. The conversion functions come from pandas, so include the line `import pandas as pd` at the beginning of the report. Once strings are converted to numeric values, they can be used in mathematical functions.

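```python
import pandas as pd
...
# Convert each column to a numeric type where possible; errors="ignore"
# returns a column unchanged if it cannot be converted.
# Note: errors="ignore" is deprecated in pandas 2.2 and later.
dataframe = dataframe.apply(pd.to_numeric, errors="ignore")
```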
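
Recent pandas releases deprecate `errors="ignore"`, so the same "convert what can be converted" behavior may need to be written out by hand. The following is a minimal sketch under stated assumptions: the frame holds made-up string values, and `to_numeric_if_possible` is a hypothetical helper, not part of pandas.

```python
import pandas as pd

# Hypothetical frame in which the numeric values arrived as strings.
dataframe = pd.DataFrame({
    "extract.age": ["43", "36", "13"],
    "extract.patient_full_name": ["Gail Hill", "Rhoda Sweet", "Jescie Lyons"],
})


def to_numeric_if_possible(column):
    """Hypothetical helper: convert a column, or return it unchanged."""
    try:
        return pd.to_numeric(column)
    except (ValueError, TypeError):
        # The column holds values that are not numbers; leave it alone.
        return column


# With the default axis, .apply() passes one column at a time.
converted = dataframe.apply(to_numeric_if_possible)

print(converted.dtypes)
print(converted["extract.age"].sum())
```

After the conversion, the age column supports arithmetic such as `.sum()`, while the name column is left as text.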

---

### Splitting Columns

Splitting a column can be useful if a column contains multiple pieces of information or if a complex operation needs an intermediate calculation for simplicity or efficiency.

In the above table, it may be desirable to split the full name into two columns for the first and last names. To do so, use the `.apply()` function once again:

```python
dataframe['extract.patient_first_name'] = dataframe['extract.patient_full_name'].apply(lambda value: value.split()[0])
dataframe['extract.patient_last_name'] = dataframe['extract.patient_full_name'].apply(lambda value: value.split()[1])
```

|  | extract.age | extract.patient_full_name | extract.patient_id | extract.patient_first_name | extract.patient_last_name |
| --- | --- | --- | --- | --- | --- |
| 0 | 43 | Gail Hill | 94451 | Gail | Hill |
| 1 | 43 | Gail Hill | 94451 | Gail | Hill |
| 2 | 36 | Rhoda Sweet | 92830 | Rhoda | Sweet |
| 3 | 13 | Jescie Lyons | 92721 | Jescie | Lyons |
| 4 | 70 | Lyle Cash | 92570 | Lyle | Cash |

---

### Combining Columns

It is also possible to combine columns. In the example below, the first name and patient ID are concatenated into a single new column:

```python
dataframe['nameId'] = dataframe.apply(lambda row: row['extract.patient_first_name'] + str(row['extract.patient_id']), axis=1)
```

This time, `axis=1` makes `.apply()` pass each entire row to the function so that multiple columns can be used.

|  | extract.age | extract.patient_full_name | extract.patient_id | extract.patient_first_name | extract.patient_last_name | nameId |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | 43 | Gail Hill | 94451 | Gail | Hill | Gail94451 |
| 1 | 43 | Gail Hill | 94451 | Gail | Hill | Gail94451 |
| 2 | 36 | Rhoda Sweet | 92830 | Rhoda | Sweet | Rhoda92830 |
| 3 | 13 | Jescie Lyons | 92721 | Jescie | Lyons | Jescie92721 |
| 4 | 70 | Lyle Cash | 92570 | Lyle | Cash | Lyle92570 |
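
For a simple concatenation like this one, whole columns can also be combined directly without visiting each row. The sketch below compares the two approaches on a hand-built stand-in frame; the variable names `row_wise` and `column_wise` are illustrative only.

```python
import pandas as pd

# Stand-in frame with values copied from the sample table.
dataframe = pd.DataFrame({
    "extract.patient_first_name": ["Gail", "Rhoda", "Jescie", "Lyle"],
    "extract.patient_id": [94451, 92830, 92721, 92570],
})

# Row-wise version: the lambda receives one row at a time (axis=1).
row_wise = dataframe.apply(
    lambda row: row["extract.patient_first_name"] + str(row["extract.patient_id"]),
    axis=1,
)

# Column-wise version: convert the IDs to strings, then concatenate.
column_wise = (
    dataframe["extract.patient_first_name"]
    + dataframe["extract.patient_id"].astype(str)
)

dataframe["nameId"] = column_wise
print(dataframe["nameId"].tolist())
```

The row-wise form remains the more flexible choice when the logic needs conditions or several steps per row.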

---

## Useful Column Functions

The list below contains some of the more useful pre-built functions available when using a data frame. Either assign the result to a variable, as in `dataframe_sum = dataframe['extract.age'].sum()`, or print the result directly, as in `print(dataframe['extract.age'].sum())`.

- mean  
`dataframe['extract.age'].mean()` returns 41.0
- sum  
`dataframe['extract.age'].sum()` returns 205
- unique  
`dataframe['extract.age'].unique()` returns [43 36 13 70]
- tolist  
`dataframe['extract.age'].tolist()` returns [43, 43, 36, 13, 70]
- describe  
`dataframe['extract.age'].describe()` returns the following information:

```text
count     5.000000
mean     41.000000
std      20.359273
min      13.000000
25%      36.000000
50%      43.000000
75%      43.000000
max      70.000000
Name: extract.age, dtype: float64
```

`dataframe.describe()` returns the following table:

|  | extract.age | extract.patient_id |
| --- | --- | --- |
| count | 5.000000 | 5.000000 |
| mean | 41.000000 | 93404.600000 |
| std | 20.359273 | 959.679269 |
| min | 13.000000 | 92570.000000 |
| 25% | 36.000000 | 92721.000000 |
| 50% | 43.000000 | 92830.000000 |
| 75% | 43.000000 | 94451.000000 |
| max | 70.000000 | 94451.000000 |
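
The functions above can be exercised on the sample ages as follows. This is a small sketch that assumes the ages from the table are held in a standalone Series; the variable names are illustrative. It also shows that the result of `.describe()` can be indexed by label to read a single statistic.

```python
import pandas as pd

# Ages copied from the sample table.
ages = pd.Series([43, 43, 36, 13, 70], name="extract.age")

# Assign results to variables for later use.
age_mean = ages.mean()
age_sum = ages.sum()
distinct_ages = ages.unique().tolist()

# .describe() returns a Series, so one statistic can be read by its label.
description = ages.describe()
age_median = description["50%"]

print(age_mean, age_sum, distinct_ages, age_median)
```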

---

## Stack and Unstack Functions

The `.stack()` function moves the column labels into an inner level of the row index, producing a multi-index made up of the row label and the column name. To print the result, use `print(dataframe.stack())`, where *dataframe* is the variable assigned to the query.

|  |  |  |
| --- | --- | --- |
| 0 | extract.age | 43 |
|  | extract.patient_full_name | Gail Hill |
|  | extract.patient_id | 94451 |
|  | extract.patient_first_name | Gail |
|  | extract.patient_last_name | Hill |
|  | nameId | Gail94451 |
| 1 | extract.age | 43 |
|  | extract.patient_full_name | Gail Hill |
|  | extract.patient_id | 94451 |
|  | extract.patient_first_name | Gail |
|  | extract.patient_last_name | Hill |
|  | nameId | Gail94451 |
| 2 | extract.age | 36 |
|  | extract.patient_full_name | Rhoda Sweet |
|  | extract.patient_id | 92830 |
|  | extract.patient_first_name | Rhoda |
|  | extract.patient_last_name | Sweet |
|  | nameId | Rhoda92830 |
| 3 | extract.age | 13 |
|  | extract.patient_full_name | Jescie Lyons |
|  | extract.patient_id | 92721 |
|  | extract.patient_first_name | Jescie |
|  | extract.patient_last_name | Lyons |
|  | nameId | Jescie92721 |
| 4 | extract.age | 70 |
|  | extract.patient_full_name | Lyle Cash |
|  | extract.patient_id | 92570 |
|  | extract.patient_first_name | Lyle |
|  | extract.patient_last_name | Cash |
|  | nameId | Lyle92570 |

`dtype: object`

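A value in the stacked result is addressed by its row label and column name. For example, `dataframe.stack().loc[(0, 'extract.age')]` returns 43, the same value that `dataframe.loc[0, 'extract.age']` returns on the original data frame.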

The `.unstack()` function produces the opposite effect of the `.stack()` function: it moves the inner index level back into columns.
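
A round trip through both functions can make the relationship concrete. The sketch below uses a two-row, two-column stand-in frame with values taken from the sample table; `stacked` and `restored` are illustrative names.

```python
import pandas as pd

# Small stand-in frame with values taken from the sample table.
dataframe = pd.DataFrame({
    "extract.age": [43, 36],
    "extract.patient_full_name": ["Gail Hill", "Rhoda Sweet"],
})

# Stacking yields a Series indexed by (row label, column name).
stacked = dataframe.stack()
print(stacked)

# Read one value through both levels of the multi-index.
first_age = stacked.loc[(0, "extract.age")]

# Unstacking moves the inner level back into columns. Because the
# stacked Series mixes numbers and text, the restored columns may
# come back with a generic object type rather than the original types.
restored = stacked.unstack()
print(restored)
```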

---

## Transpose() Function

The `.transpose()` function switches the rows and columns, so each row label becomes a column header and each column name becomes a row label.

|  | extract.age | extract.patient_full_name | extract.patient_id | extract.patient_first_name | extract.patient_last_name | nameId |
| --- | --- | --- | --- | --- | --- | --- |
| 0 | 43 | Gail Hill | 94451 | Gail | Hill | Gail94451 |
| 1 | 43 | Gail Hill | 94451 | Gail | Hill | Gail94451 |
| 2 | 36 | Rhoda Sweet | 92830 | Rhoda | Sweet | Rhoda92830 |
| 3 | 13 | Jescie Lyons | 92721 | Jescie | Lyons | Jescie92721 |
| 4 | 70 | Lyle Cash | 92570 | Lyle | Cash | Lyle92570 |

The code below transposes the table:

```python
dataframe = dataframe.transpose()
```

|  | 0 | 1 | 2 | 3 | 4 |
| --- | --- | --- | --- | --- | --- |
| extract.age | 43 | 43 | 36 | 13 | 70 |
| extract.patient_full_name | Gail Hill | Gail Hill | Rhoda Sweet | Jescie Lyons | Lyle Cash |
| extract.patient_id | 94451 | 94451 | 92830 | 92721 | 92570 |
| extract.patient_first_name | Gail | Gail | Rhoda | Jescie | Lyle |
| extract.patient_last_name | Hill | Hill | Sweet | Lyons | Cash |
| nameId | Gail94451 | Gail94451 | Rhoda92830 | Jescie92721 | Lyle92570 |
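
The effect on the shape and labels can be checked with a small stand-in frame. This sketch assumes three sample rows and two numeric columns; `transposed` is an illustrative name, and `.T` is the pandas shorthand for `.transpose()`.

```python
import pandas as pd

# Stand-in frame: three rows, two columns.
dataframe = pd.DataFrame({
    "extract.age": [43, 36, 13],
    "extract.patient_id": [94451, 92830, 92721],
})

transposed = dataframe.transpose()

# Three rows by two columns becomes two rows by three columns.
print(dataframe.shape, transposed.shape)

# The old column names are now the row labels.
print(transposed.index.tolist())

# The same cell is reached with the labels swapped.
print(dataframe.loc[0, "extract.age"], transposed.loc["extract.age", 0])

# The .T attribute is shorthand for .transpose().
same_result = dataframe.T.equals(transposed)
```

Transposing a second time returns the table to its original orientation.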
