Data Analysis using Pandas
Pandas has become the de facto package for data analysis. In this workshop, we are going to use the basics of pandas to analyze the interests of today's group. We are going to use meetup.com's API to fetch the list of interests listed in each of our meetup.com profiles. We will compute which interests are common, which are uncommon, and find out which two members have the most similar interests. Let's get started by importing the essentials.
You will need meetup.com's Python API client and pandas installed.
import meetup.api
import pandas as pd
from IPython.display import Image, display, HTML
from itertools import combinations
Next we need your meetup.com API key. You will find it at https://secure.meetup.com/meetup_api/key/. We also need today's event id. The event id created under Chicago Pythonistas is 233460758 and the one under Chicago Python user group is 236205125. Use the one that has the higher number of RSVPs so that you get more data points. As an additional exercise, you might try merging the two sets of RSVPs, but that's not needed for the workshop.
API_KEY = ''
event_id = ''
The following functions use the API and load the data into a pandas DataFrame. Note that we are a bit sloppy both in style and in how we load the data. In actual production code, we should add adequate logging with well-defined exceptions to indicate what's going wrong.
def get_members(event_id):
    # Fetch the RSVPs for the event, then look up every member who responded.
    client = meetup.api.Client(API_KEY)
    rsvps = client.GetRsvps(event_id=event_id, urlname='_ChiPy_')
    member_id = ','.join([str(i['member']['member_id']) for i in rsvps.results])
    return client.GetMembers(member_id=member_id)


def get_topics(members):
    # Collect the distinct topic names across all members.
    topics = set()
    for member in members.results:
        # Some members list no topics at all.
        for t in member.get('topics', []):
            topics.add(t['name'])
    return list(topics)


def df_topics(event_id):
    members = get_members(event_id=event_id)
    topics = get_topics(members)
    columns = ['name', 'id', 'thumb_link'] + topics
    data = []
    for member in members.results:
        # One 0/1 flag per topic, in the same order as the topic columns.
        topic_vector = [0] * len(topics)
        for topic in member.get('topics', []):
            index = topics.index(topic['name'])
            topic_vector[index] = 1
        try:
            data.append([member['name'], member['id'], member['photo']['thumb_link']] + topic_vector)
        except KeyError:
            # Skip members with a missing field (for example, no profile photo).
            pass
    return pd.DataFrame(data=data, columns=columns)

#df.to_csv('output.csv', sep=";")
So you need to call the df_topics function with the event id, and it will give you back a pandas DataFrame containing basic information about each member along with one column for every possible interest. If the member has indicated an interest, that column will hold a one; if not, it will hold a zero.
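To make that layout concrete without calling the API, here is a minimal sketch that builds the same kind of 0/1 table from three hypothetical member records. The names, ids and image paths are made up for illustration; only the field names (name, id, photo/thumb_link, topics) mirror what df_topics reads. The topic list is sorted here just to keep the column order predictable.

```python
import pandas as pd

# Hypothetical records, shaped like the fields df_topics reads from the API.
members = [
    {"name": "Ada", "id": 1, "photo": {"thumb_link": "ada.jpg"},
     "topics": [{"name": "Python"}, {"name": "Data Science"}]},
    {"name": "Grace", "id": 2, "photo": {"thumb_link": "grace.jpg"},
     "topics": [{"name": "Python"}]},
    {"name": "Linus", "id": 3, "photo": {"thumb_link": "linus.jpg"},
     "topics": []},
]

# Distinct topic names, sorted so the column order is deterministic.
topics = sorted({t["name"] for m in members for t in m["topics"]})

rows = []
for m in members:
    listed = {t["name"] for t in m["topics"]}
    # 1 if the member listed the topic, 0 otherwise.
    flags = [1 if t in listed else 0 for t in topics]
    rows.append([m["name"], m["id"], m["photo"]["thumb_link"]] + flags)

toy_df = pd.DataFrame(rows, columns=["name", "id", "thumb_link"] + topics)

# Summing each topic column counts how many members listed that interest.
topic_counts = toy_df[topics].sum().sort_values(ascending=False)

print(toy_df)
print(topic_counts)
```

Because every interest is its own 0/1 column, summing a column gives the number of members who share that interest, which is the starting point for finding the common and uncommon ones.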