Parse your Google Hangouts data and visualize your messages over time
Latest commit f107b7cb25 by Jacob Manning, 2021-03-10 19:28:42 -05:00: Merge pull request #2 from jacobmanning/feature/check-for-conversation-state (Add check for missing conversation_state)
raw Initial commit 2018-07-01 10:22:29 -04:00
.gitignore Ignore output directory 2018-07-01 10:52:14 -04:00
LICENSE Create LICENSE 2018-07-01 10:56:57 -04:00
parser.py Add check for missing conversation_state 2021-03-10 19:26:26 -05:00
README.md Add visualization support 2018-07-01 12:04:10 -04:00
requirements.txt Update versions for requirements 2018-07-01 12:03:59 -04:00
utils.py Reformat logging module 2018-07-06 20:23:40 -04:00
visualize.py Add visualization support 2018-07-01 12:04:10 -04:00

hangouts-parser

This repository parses conversation data from Google Hangouts and reports the number of messages in each conversation. It provides two scripts: parser.py and visualize.py. parser.py reads the raw JSON export from Google Takeout and writes a pickled summary file for each conversation. visualize.py reads a pickled conversation summary and plots a histogram of messages over time.
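The parsing step described above can be sketched roughly as follows. This is a minimal illustration and not the code in parser.py: the input layout (a top-level "conversations" list whose entries carry a "conversation_id" string and an "events" list with microsecond "timestamp" strings) is a simplified assumption, and the function names load_takeout, summarize_conversations, and write_summaries are hypothetical. The real Takeout export is more deeply nested.

```python
import json
import pickle
from pathlib import Path


def load_takeout(path):
    """Read a Takeout-style JSON export from disk."""
    with open(path, "r", encoding="utf-8") as handle:
        return json.load(handle)


def summarize_conversations(takeout):
    """Map each conversation id to a sorted list of message timestamps.

    ASSUMPTION: the simplified layout described in the lead-in, not the
    exact schema of a real Hangouts.json file.
    """
    summaries = {}
    for conversation in takeout.get("conversations", []):
        conversation_id = conversation.get("conversation_id")
        if conversation_id is None:
            # Skip malformed entries instead of crashing on them.
            continue
        events = conversation.get("events", [])
        summaries[conversation_id] = sorted(int(e["timestamp"]) for e in events)
    return summaries


def write_summaries(summaries, output_dir):
    """Pickle one summary per conversation as <output_dir>/<conversation_id>.pkl."""
    output_dir = Path(output_dir)
    output_dir.mkdir(parents=True, exist_ok=True)
    paths = []
    for conversation_id, timestamps in summaries.items():
        path = output_dir / f"{conversation_id}.pkl"
        with path.open("wb") as handle:
            pickle.dump(timestamps, handle)
        paths.append(path)
    return paths
```

With these helpers, `write_summaries(summarize_conversations(load_takeout("raw/Hangouts.json")), "output")` would produce one .pkl file per conversation, provided the input matched the assumed layout.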

Usage

  1. Clone this repository
  2. Download your Hangouts data
    • Navigate to Google Takeout
    • Choose "Select None" and manually select Hangouts
    • Download the data as a zip archive, extract it, and move the Hangouts.json file into the raw folder in this repository
  3. Install dependencies via pip
    • pip install -r requirements.txt
    • parser.py has no third-party dependencies; only visualize.py needs the packages in requirements.txt
  4. Run the parser
    • Note: if your Hangouts data is not at raw/Hangouts.json, pass the path to the .json file to parser.py with the -f flag
python parser.py
  5. Run the visualization
    • The <conversation_id> can be found in the output of the parser.py script
python visualize.py -f output/<conversation_id>.pkl
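To give a feel for what the visualization step does with a pickled summary, here is a minimal sketch that bins message timestamps by month and renders a text histogram using only the standard library. It is not the code in visualize.py: the pickle contents (a plain list of microsecond timestamps) are an assumption, and load_summary, messages_per_month, and render_text_histogram are hypothetical names.

```python
import pickle
from collections import Counter
from datetime import datetime, timezone


def load_summary(path):
    """Load a pickled summary. Only unpickle files you created yourself."""
    with open(path, "rb") as handle:
        return pickle.load(handle)


def messages_per_month(timestamps_us):
    """Count messages per calendar month (UTC) from microsecond timestamps."""
    counts = Counter()
    for timestamp in timestamps_us:
        moment = datetime.fromtimestamp(timestamp / 1_000_000, tz=timezone.utc)
        counts[moment.strftime("%Y-%m")] += 1
    # Return months in chronological order.
    return dict(sorted(counts.items()))


def render_text_histogram(counts, width=40):
    """Render one bar per month, scaled so the busiest month spans `width` characters."""
    peak = max(counts.values(), default=0)
    lines = []
    for month, count in counts.items():
        bar = "#" * max(1, round(width * count / peak))
        lines.append(f"{month} {bar} {count}")
    return "\n".join(lines)
```

Under the stated assumption, `print(render_text_histogram(messages_per_month(load_summary("output/<conversation_id>.pkl"))))` would print one bar per month. A plotting library could replace the text rendering; the binning step stays the same.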

License

This code is freely available under the GNU General Public License (GPL). See the LICENSE file for details.

Privacy notice

All of the data processing in these scripts happens locally on your computer. The data you provide to the script is NOT uploaded to an external server. Feel free to examine the code if you are concerned.

Acknowledgements

This repository was inspired by MasterScrat/Chatistics. Chatistics can parse Facebook Messenger and Telegram data, but not Hangouts group messages. I originally intended to contribute Hangouts group message support to that repository, but my design drifted far from its existing design, so I created a new project. Shout-out to MasterScrat for great work, and thanks for the inspiration!