How to Compare two CSV files in Python using Pandas ( Steps )


Pandas is the best Python library for creating and manipulating dataframe. You can read CSV files, manipulate them and also export the final CSV file after manipulation. Suppose you have two CSV files and want to compare both of them. How you will do? In this entire tutorial, you will learn how to how to compare two CSV files in python using pandas using various methods.

Steps to Compare two CSV files in Python

In this section, you will know all the steps required for comparing two CSV files in python using pandas. Just follow all the steps for a better understanding.

Step 1: Create a Sample CSVs file

The first step is to create sample CSV files for the method. Firstly I will create two sample dataframes and then export each dataframe to a CSV file. It will use to read the CSV file and compare them.

Execute the below lines of code to create two sample CSVs files.

import pandas as pd
data1 = {"country":["India","USA","UK","Germany"],"dial_code":[91,1,44,49]}
df1 = pd.DataFrame(data1)
df1.to_csv("data1.csv",index=None)
data2 = {"country":["India","USA","UK","Germany","Australia","China"],"dial_code":[91,1,44,49,61,86]}
df2 = pd.DataFrame(data2)
df2.to_csv("data2.csv",index=None)
print(df1,"n")
print(df2)

Output

Sample dataframe creation for comparing between them
Sample dataframe creation for comparing between them

Step 2: Read  the CSV files

The second step is to read the created CSV files. You can read the CSV file using the pandas read_csv() method. Just pass the filename of the CSV file. It will convert the CSV file data to dataframe for manipulation.

Run the below lines of code to read your CSV files.

import pandas as pd
df1 = pd.read_csv("data1.csv")
df2 = pd.read_csv("data2.csv")
print(df1,"n")
print(df2,"n")

Output

  country  dial_code
0    India         91
1      USA          1
2       UK         44
3  Germany         49 

     country  dial_code
0      India         91
1        USA          1
2         UK         44
3    Germany         49
4  Australia         61
5      China         86 

 

Step 3: Implement the method to compare two CSV files in python using pandas

Now the CSV files have been read. Let’s compare both files. You will learn the various method to compare to CSV files in python.

Method 1 : Using the isin() method

The pandas package has a function isin() that allows you to check whether there are records in both the CSV files or not. If it finds then returns true else returns false. After that, we can get the values using the df[“boolean_result”].

Execute the below lines of code to compare the two CSV files.

# method 1 isin() metthod
import pandas as pd
df1 = pd.read_csv("data1.csv")
df2 = pd.read_csv("data2.csv")
c_result = df1[df1.apply(tuple,1).isin(df2.apply(tuple,1))]
print(c_result)

Output

Comparing two csv files using the isin() method
Comparing two csv files using the isin() method

Here I am also using the apply() method for comparing each row record with each CSV file.

Method 2: Compare CSV files using the merge() method

Pandas also have a function merge() that is useful in comparing the two CSV files. It performs an inner join, outer join or both join on columns. You have to just pass the dataframes you want to compare as a list inside the merge() method. The function will compare and returns the dataframe.

Run the below lines of code to compare the CSV files.

# method 2 merge() method
import pandas as pd
df1 = pd.read_csv("data1.csv")
df2 = pd.read_csv("data2.csv")
c_result_m = pd.merge(df1,df2)
print(c_result_m)

Output

Comparing two csv files using the merge() method
Comparing two csv files using the merge() method

Conclusion

Pandas is the best python package for manipulating large datasets. If you have CSV files then you can compare them using the above methods.

I hope you have liked this tutorial. If you have any suggestions or want to include some methods in this tutorial then you can contact us for more information.

Join our list

Subscribe to our mailing list and get interesting stuff and updates to your email inbox.

Thank you for signup. A Confirmation Email has been sent to your Email Address.

Something went wrong.





Source link