# Be careful with Inf values when using Pandas to calculate the correlation between variables


A weird thing happened to me when I was exploring pairwise correlations among variables stored in `pandas.DataFrame` objects. My goal was to get the pairwise Pearson coefficients between the variables in `pandas.DataFrame` **A** and the variables in `pandas.DataFrame` **B**. There are multiple ways to perform such an analysis. I originally used `A.apply(lambda v: B.corrwith(v))`, and a few unexpected NaN values showed up in the output. However, those NaN values disappeared when I implemented the calculation via `A.merge(B).corr()`. So, **why is there a discrepancy?**

## Look into the difference between the `corr` and `corrwith` methods

```
import pandas as pd
import numpy as np
# Use np.inf: the np.Inf alias was removed in NumPy 2.0
df = pd.DataFrame({'A': [1, 2, 3], 'B': [3, 2, np.inf]})
```

The `corr` method tolerates Inf values. Here the result matches what the two finite rows alone would give:

```
df.corr()
```

| | A | B |
|---|---|---|
| A | 1.0 | -1.0 |
| B | -1.0 | 1.0 |

The `corrwith` method cannot handle Inf values and returns NaN instead:

```
df.A.to_frame().corrwith(df.B)
```

```
A NaN
dtype: float64
```
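Since the two methods disagree only when infinite values are present, it can help to check which columns contain them before correlating. The following is a minimal sketch, assuming every column is numeric; the frame simply mirrors the example above, and the names `has_inf` and `inf_counts` are illustrative.

```python
import numpy as np
import pandas as pd

df = pd.DataFrame({'A': [1, 2, 3], 'B': [3, 2, np.inf]})

# True for every column that holds at least one +inf or -inf
has_inf = np.isinf(df).any()
print(has_inf)

# Number of infinite entries per column
inf_counts = np.isinf(df).sum()
print(inf_counts)
```

Any column flagged here is a candidate for the unexpected NaN results from `corrwith`.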

## A workaround when Inf values can be ignored

```
df.A.to_frame().corrwith(df.B.replace(np.inf, np.nan))
```

```
A -1.0
dtype: float64
```
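The same idea can be applied to the original two-frame problem. The sketch below is one possible way to do it, not a definitive implementation: `corr_between` is a hypothetical helper, the frames `A` and `B` and their column names are made up for illustration, and it also replaces negative infinity, which the one-liner above does not cover.

```python
import numpy as np
import pandas as pd

def corr_between(left, right):
    """Hypothetical helper: Pearson correlation of every column in `left`
    with every column in `right`, treating +inf and -inf as missing."""
    left = left.replace([np.inf, -np.inf], np.nan)
    right = right.replace([np.inf, -np.inf], np.nan)
    # One output column per column of `left`, one row per column of `right`
    return left.apply(lambda col: right.corrwith(col))

# Illustrative data: each column of B has one infinite entry
A = pd.DataFrame({'a1': [1.0, 2.0, 3.0, 4.0]})
B = pd.DataFrame({'b1': [4.0, 3.0, 2.0, np.inf],
                  'b2': [1.0, 2.0, -np.inf, 4.0]})

result = corr_between(A, B)
print(result)
```

Each coefficient is then computed from the rows where both values are finite, so whether dropping those rows is acceptable depends on what the infinite values mean in your data.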