Functions in a Python DataFrame

Whenever we have data in a dataframe, we may want to change its values. Three situations are investigated below.

In the first situation, only one field is taken into account. The simplest approach is to write the transformation as a one-liner:

from datetime import datetime
df["Birthdate"].astype(str).apply(lambda x: datetime.strptime(x, '%Y-%m-%d') if x != 'None' else '')

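The dataframe is called df. A single column is addressed as df["column name"], which yields a Series. Its values are first converted to strings with astype(str). Each string is then parsed into a date, except the text 'None' (how a missing value appears after the conversion here), which becomes an empty string.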
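The chain of steps described above can be taken apart to see what each stage produces. This is only a minimal sketch: the two sample birth dates are made up, and no missing values are included to keep the steps easy to follow.

```python
from datetime import datetime

import pandas as pd

# Hypothetical sample data, not taken from the original dataset.
df = pd.DataFrame({"Birthdate": ["1980-05-17", "2001-12-03"]})

# Step 1: selecting one column gives a pandas Series.
column = df["Birthdate"]

# Step 2: make sure every value is text.
as_text = column.astype(str)

# Step 3: parse each text value into a date.
as_dates = as_text.apply(lambda x: datetime.strptime(x, "%Y-%m-%d"))
```

Note that none of these steps changes df itself; each one returns a new Series.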

In the second situation, the transformation is written as a named function:

def _datum(s):
  # Parse 'YYYY-MM-DD' text into a datetime; the text 'None' becomes ''.
  if s != 'None':
    return datetime.strptime(s, '%Y-%m-%d')
  else:
    return ''

The function can then be applied to a column:

df["Date_of_Death__c"].astype(str).apply(_datum)
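Since apply returns a new Series, the result has to be assigned somewhere if it should be kept. The sketch below shows one way to do that. The function name parse_date, the sample values and the target column Date_of_Death_parsed are assumptions made for illustration. It also tests for missing values with pd.isna rather than comparing against the text 'None', which is a variation on the approach above and not the original code.

```python
from datetime import datetime

import pandas as pd


def parse_date(value):
    """Return a datetime for 'YYYY-MM-DD' text, or '' when the value is missing."""
    if pd.isna(value):
        return ""
    return datetime.strptime(value, "%Y-%m-%d")


# Hypothetical sample data with one missing date.
df = pd.DataFrame({"Date_of_Death__c": ["2019-03-08", None]})

# Store the result in a new (hypothetical) column so the original stays intact.
df["Date_of_Death_parsed"] = df["Date_of_Death__c"].apply(parse_date)
```

Mixing dates and empty strings in one column follows the original function; whether that is convenient depends on what happens to the column afterwards.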

In the third situation, several fields are needed to create a new value. We then create a function like:

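def lange_naam(row):
  # Fields are picked by 0-based position within the row.
  FirstName = row.iloc[4]
  LastName = row.iloc[6]
  if FirstName is not None and LastName is not None:
    if len(FirstName + LastName) > 20:
      return 'LangeNaam'     # long name
    else:
      return 'Valt wel Mee'  # not too bad
  else:
    return 'Zooi'            # junk: a name is missing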

It can then be used as:

df.apply(lange_naam, axis="columns")
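Picking fields by position breaks as soon as the column order changes, so a variant that looks them up by name may be easier to maintain. The following is a sketch under assumptions: the column labels FirstName and LastName are guessed from the variable names above, and the function name name_length_label, the sample rows and the NameLabel column are invented for the example.

```python
import pandas as pd


def name_length_label(row):
    """Classify a row by the combined length of its first and last name."""
    first, last = row["FirstName"], row["LastName"]
    if pd.isna(first) or pd.isna(last):
        return "Zooi"  # a name is missing
    if len(first + last) > 20:
        return "LangeNaam"  # long name
    return "Valt wel Mee"  # not too bad


# Hypothetical sample data; the third row has no first name.
df = pd.DataFrame({
    "FirstName": ["Maximiliaan", "Jan", None],
    "LastName": ["van der Westhuizen", "Smit", "Bakker"],
})

# axis="columns" passes each row to the function as a Series.
df["NameLabel"] = df.apply(name_length_label, axis="columns")
```

With axis="columns" the function is called once per row, so the labels inside the function are column names.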

By tom