Months between two dates

  30 May 2019
  python

Importing Packages and Datasets

import pandas as pd
import numpy as np
start_date = ['2019-06-03', '2019-06-13', '2018-11-05', '2019-05-31', '2019-06-01', '2019-09-01']
end_date =  ['2019-08-31', '2019-08-23', '2018-11-25', '2019-07-1', '2019-07-31', '2019-10-25']
data = pd.DataFrame(list(zip(start_date,end_date)), columns = ['Start Date', 'End Date'])
Start DateEnd Date
02019-06-032019-08-31
12019-06-132019-08-23
22018-11-052018-11-25
32019-05-312019-07-1
42019-06-012019-07-31
52019-09-012019-10-25

Logic

# Store data index as a new column which will act as a primary key will be used later and initiate a blank new dataframe
data['Key'] = data.reset_index(drop = True).index
final = pd.DataFrame()
# Start the loop which will iterate on each particular combination of start data and end date -
for i in range(0,len(data)):
    start_date = data['Start Date'][i]
    end_date = data['End Date'][i]
    print("-----------------------------------------------------------------------------")
    print("Start Date is {0}, End Date is {1}".format(start_date, end_date))
    print("-----------------------------------------------------------------------------")
# If both dates fall in same month and year, then the date range is the same ( no consecutive months exists), check for same and proceed if the any of these are not same
# Store Month start date falling between two dates in  a new dataframe object
# Store Month end dates falling between two dates
# If first month start date is not equal to the start date of the date range, then we need to shift the index by one to accommodate for the enddate of the first month
    if ((pd.to_datetime(start_date).month != pd.to_datetime(end_date).month) | (pd.to_datetime(start_date).year != pd.to_datetime(end_date).year)):
        month_startdates = pd.DataFrame(pd.date_range(start_date, end_date, freq = 'MS'), columns=['Month_StartDate'])
        month_enddates = pd.DataFrame(pd.date_range(start_date, end_date, freq = 'M'), columns = ['Month_EndDate'])
        print("Ranges of months in the above date range : \n {0} \n {1}".format(month_startdates, month_enddates))
        if not pd.to_datetime(start_date).month == pd.to_datetime(month_startdates['Month_StartDate'][0]).month:
            month_startdates.index = range(1,len(month_startdates)+1)
        df = pd.concat([month_startdates,month_enddates], axis = 1)
        print("\n After adjusting for indexes : \n {0} ".format(df))
        df['Key'] = i
        final = final.append(df).reset_index(drop = True)
        print("\n Map it to orignal index, so that it can be merged with oringal DF : \n {0}".format(df))
    else:
# If month and year of both start date and end date are same, simply, create the record with the same
        final.append({'Month_StartDate':pd.to_datetime(start_date),
              'Month_EndDate': pd.to_datetime(end_date),
              'Key': i}, ignore_index =True)

Example 1 : When i = 1

-----------------------------------------------------------------------------
Start Date is 2019-06-13, End Date is 2019-08-23
-----------------------------------------------------------------------------
Ranges of months in the above date range :
   Month_StartDate
0      2019-07-01
1      2019-08-01
   Month_EndDate
0    2019-06-30
1    2019-07-31

 After adjusting for indexes :
   Month_StartDate Month_EndDate
0             NaT    2019-06-30
1      2019-07-01    2019-07-31
2      2019-08-01           NaT

 Map it to orignal index, so that it can be merged with oringal DF :
   Month_StartDate Month_EndDate  Key
0             NaT    2019-06-30    1
1      2019-07-01    2019-07-31    1
2      2019-08-01           NaT    1

Example 2 : When i = 3

-----------------------------------------------------------------------------
Start Date is 2019-05-31, End Date is 2019-07-1
-----------------------------------------------------------------------------
Ranges of months in the above date range :
   Month_StartDate
0      2019-06-01
1      2019-07-01
   Month_EndDate
0    2019-05-31
1    2019-06-30

 After adjusting for indexes :
   Month_StartDate Month_EndDate
0             NaT    2019-05-31
1      2019-06-01    2019-06-30
2      2019-07-01           NaT

 Map it to orignal index, so that it can be merged with oringal DF :
   Month_StartDate Month_EndDate  Key
0             NaT    2019-05-31    3
1      2019-06-01    2019-06-30    3
2      2019-07-01           NaT    3

Merge with orignal data on key and index

data = pd.merge(data, final, how = 'left')
data
Start DateEnd DateKeyMonth_StartDateMonth_EndDate
02019-06-032019-08-310NaT2019-06-30
12019-06-032019-08-3102019-07-012019-07-31
22019-06-032019-08-3102019-08-012019-08-31
32019-06-132019-08-231NaT2019-06-30
42019-06-132019-08-2312019-07-012019-07-31
52019-06-132019-08-2312019-08-01NaT
62018-11-052018-11-252NaTNaT
72019-05-312019-07-13NaT2019-05-31
82019-05-312019-07-132019-06-012019-06-30
92019-05-312019-07-132019-07-01NaT
102019-06-012019-07-3142019-06-012019-06-30
112019-06-012019-07-3142019-07-012019-07-31
122019-09-012019-10-2552019-09-012019-09-30
132019-09-012019-10-2552019-10-01NaT

Adjustment for missing start and end date

data.loc[data['Month_StartDate'].isnull(),'Month_StartDate'] = pd.to_datetime(data['Start Date'])
data.loc[data['Month_EndDate'].isnull(),'Month_EndDate'] = pd.to_datetime(data['End Date'])
data
Start DateEnd DateKeyMonth_StartDateMonth_EndDate
02019-06-032019-08-3102019-06-032019-06-30
12019-06-032019-08-3102019-07-012019-07-31
22019-06-032019-08-3102019-08-012019-08-31
32019-06-132019-08-2312019-06-132019-06-30
42019-06-132019-08-2312019-07-012019-07-31
52019-06-132019-08-2312019-08-012019-08-23
62018-11-052018-11-2522018-11-052018-11-25
72019-05-312019-07-132019-05-312019-05-31
82019-05-312019-07-132019-06-012019-06-30
92019-05-312019-07-132019-07-012019-07-01
102019-06-012019-07-3142019-06-012019-06-30
112019-06-012019-07-3142019-07-012019-07-31
122019-09-012019-10-2552019-09-012019-09-30
132019-09-012019-10-2552019-10-012019-10-25

Jupyter Notebook Link - Months between two dates