  • Updated to 2.2.0

    Major update! Code refactoring, reply retrieval, translated descriptions, speedier decryption of media, support for analyzing private groups, bug fixes and much more!
  • Jordan Wildon committed with GitHub 2 years ago
    98ed769c
    1 parent 5a9176ab
  • CHANGES.txt
     1 +v2.2.0 16.10.2022 -- Major update! Code refactoring, reply retrieval,
     2 + translated descriptions, speedier decryption of media, support for
     3 + analyzing private groups, bug fixes and much more!
     4 + 
    1 5   
    2 6  v2.1.10 25.08.2022 -- Bugs busted, solved initialization error, efficiency tweaks,
    3 7   added automatic translations, fixed issue with media archiving,
    skipped 33 lines
  • README.md
    1 1   
    2 2   
    3  -Telepathy: An OSINT toolkit for investigating Telegram chats. Developed by Jordan Wildon. Version 2.1.10.
     3 +Telepathy: An OSINT toolkit for investigating Telegram chats. Developed by Jordan Wildon. Version 2.2.0.
    4 4   
    5 5   
    6 6  ## Installation
    skipped 14 lines
    21 21   
    22 22  ## Setup
    23 23   
    24  -On first use, Telepathy will ask for your Telegram API details (obtained from my.telegram.org). Once those are set up, it will prompt you for an authorization code which will be sent to your Telegram account. If you have two-factor authentication enabled, you'll be asked to input your Telegram password.
     24 +On first use, Telepathy will ask for your Telegram API details (obtained from my.telegram.org). Once those are set up, it will prompt you to enter your phone number again and then send an authorization code to your Telegram account. If you have two-factor authentication enabled, you'll be asked to input your Telegram password.
    25 25   
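Under the hood this is the standard Telethon sign-in flow. A minimal sketch of that flow, assuming placeholder credentials obtained from my.telegram.org (not Telepathy's exact code):

```
from telethon import TelegramClient

# Placeholder values; obtain your own api_id and api_hash from my.telegram.org
api_id = 1234567
api_hash = "0123456789abcdef0123456789abcdef"
phone_number = "+15555550123"

client = TelegramClient("telepathy", api_id, api_hash)

async def main():
    # start() sends a login code to your Telegram account and prompts for it;
    # with two-factor authentication enabled it also prompts for your password.
    await client.start(phone=phone_number)
    me = await client.get_me()
    print("Signed in as", me.first_name)
    await client.disconnect()

client.loop.run_until_complete(main())
```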
    26 26   
    27 27  ## Usage:
    skipped 42 lines
    70 70   
    71 71  Use this flag to include media archiving alongside a comprehensive scan. This makes the process take significantly longer and should also be used with caution: you'll download all media content from the target chat, and it's up to you to not store illegal files on your system.
    72 72   
     73 +Since 2.2.0, downloading all media files will also generate a CSV file listing the files' metadata.
     74 + 
    73 75  For example, this will run a comprehensive scan, including media archiving:
    74 76   
    75 77  ```
    skipped 6 lines
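As for the new metadata CSV, here is a rough sketch of how per-file metadata could be collected with hachoir (which this release starts importing); the helper name, columns, and output path are illustrative, not Telepathy's exact schema:

```
import csv
from hachoir.parser import createParser
from hachoir.metadata import extractMetadata

def write_media_metadata_csv(file_paths, csv_path="media_metadata.csv"):
    # Write one row per downloaded file with whatever metadata hachoir can extract.
    with open(csv_path, "w", newline="", encoding="utf-8") as handle:
        writer = csv.writer(handle, delimiter=";")
        writer.writerow(["File", "Metadata"])
        for path in file_paths:
            parser = createParser(path)
            if not parser:
                writer.writerow([path, "unparseable"])
                continue
            with parser:
                metadata = extractMetadata(parser)
            text = "; ".join(metadata.exportPlaintext()) if metadata else "none"
            writer.writerow([path, text])
```

The semicolon delimiter matches the separator Telepathy already uses for its other CSV exports.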
    82 84  Looks up a specified user ID. This will only work if your account has "encountered" the user before (for example, after archiving a group).
    83 85   
    84 86  ```
    85  -$ telepathy -u 0123456789
     87 +$ telepathy -t 0123456789 -u
    86 88  ```
    87 89   
    88 90   
    89 91  - **'--location', '-l' [COORDINATES]**
    90 92   
    91  -Finds users near the specified coordinates. Input should be longitude followed by latitude, separated by a comma.
     93 +Finds users near the specified coordinates. Input should be longitude followed by latitude, separated by a comma. This feature only works if your Telegram account has a publicly visible profile image.
    92 94   
    93 95  ```
    94  -$ telepathy -l 51.5032973,-0.1217424
     96 +$ telepathy -t 51.5032973,-0.1217424 -l
    95 97  ```
    96 98   
    97 99  - **'--alt', '-a'**
    skipped 14 lines
    112 114  
    113 115  - **'--reply', '-r'**
    114 116   
    115  -Flag for enable the reply retrieval for the target channel, it will map users who replied in the channel and it will dump the full conversation chain
     117 +Flag to enable reply retrieval for the target channel; it maps users who replied in the channel and dumps the full conversation chain.
    116 118   
    117 119  ```
    118 120  $ telepathy -t [CHANNEL] -c -r
    skipped 24 lines
    143 145   - [x] Add location lookup.
    144 146   - [ ] Maximise compatibility of edgelists with Gephi.
    145 147   - [ ] Include sockpuppet account provisioning (creation of accounts from previous exported lists).
    146  - - [ ] Automated EXIF data report and analytics when media archiving is enabled.
     148 + - [ ] List who has admin rights in memberlists.
     149 + - [ ] Download media in the background to increase efficiency.
     150 + - [ ] Add the location of downloaded content to the archive file when media archiving is flagged.
     151 + - [ ] Add direct links to posts in the chat archive file.
    147 152   
    148 153   
    149 154  ## Feedback
    150 155   
    151  -Please send feedback to @jordanwildon on Twitter. You can follow [@TelepathyDB](twitter.com/TelepathyDB) for updates.
     156 +Please send feedback to @jordanwildon on Twitter. You can follow Telepathy updates at @TelepathyDB.
    152 157   
    153 158   
    154 159  ## Usage terms
    skipped 3 lines
    158 163   
    159 164  ## Credits
    160 165   
    161  -All tools created by Jordan Wildon (@jordanwildon). Special thanks go to [Giacomo Giallombardo](https://github.com/aaarghhh) for adding additional features and code refactoring, and [Alex Newhouse](twitter.com/AlexBNewhouse) for help and guidance with Telepathy v1.
     166 +All tools created by Jordan Wildon (@jordanwildon). Special thanks go to [Giacomo Giallombardo](https://github.com/aaarghhh) for adding additional features and code refactoring, and Alex Newhouse (@AlexBNewhouse) for his help with Telepathy v1.
    162 167   
    163 168  Where possible, credit for the use of this tool in published research is desired, but not required. This can either come in the form of crediting the author, or crediting Telepathy itself (preferably with a link).
    164 169   
  • requirements.txt
    skipped 6 lines
    7 7  requests==2.28.1
    8 8  googletrans==4.0.0rc1
    9 9  pprintpp==0.4.0
    10  - 
     10 +cryptg==0.3.1
  • telepathy/libs/const.py
    1 1  __author__ = "Jordan Wildon (@jordanwildon)"
    2 2  __license__ = "MIT License"
    3  -__version__ = "2.1.10"
     3 +__version__ = "2.2.0"
    4 4  __maintainer__ = "Jordan Wildon"
    5 5  __email__ = "[email protected]"
    6 6  __status__ = "Development"
    skipped 51 lines
  • telepathy/libs/utils.py
    skipped 81 lines
    82 82   "message_text": mess_txt,
    83 83   }
    84 84   
     85 +def process_description(desc, user_lang):
     86 + if desc is not None:
     87 + desc_txt = '"' + desc + '"'
     88 + else:
     89 + desc_txt = "none"
     90 + 
     91 + if desc_txt != "none":
     92 + translator = Translator()
     93 + detection = translator.detect(desc_txt)
     94 + language_code = detection.lang
     95 + translation_confidence = detection.confidence
     96 + translation = translator.translate(desc_txt, dest=user_lang)
     97 + original_language = translation.src
     98 + translated_text = translation.text
     99 + else:
     100 + original_language = user_lang
     101 + translated_text = "n/a"
     102 + translation_confidence = "n/a"
     103 + 
     104 + return {
     105 + "original_language": original_language,
     106 + "translated_text": translated_text,
     107 + "translation_confidence": translation_confidence,
     108 + "description_text": desc_txt,
     109 + }
     110 + 
    85 111  def color_print_green(first_string,second_string):
    86 112   print(
    87 113   Fore.GREEN
    skipped 49 lines
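For reference, a hedged usage sketch of the new process_description helper; the sample description and printed values are illustrative, and translation requires network access for googletrans:

```
from libs.utils import process_description

# Hypothetical, untranslated channel description
result = process_description("Canale di notizie e analisi", "en")

print(result["description_text"])        # the original description, wrapped in quotes
print(result["original_language"])       # e.g. "it"
print(result["translated_text"])         # e.g. '"News and analysis channel"'
print(result["translation_confidence"])  # detection confidence reported by googletrans
```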
  • telepathy/telepathy.py
    skipped 4 lines
    5 5   An OSINT toolkit for investigating Telegram chats.
    6 6  """
    7 7   
     8 +from tokenize import group
    8 9  import pandas as pd
    9 10  import datetime
    10 11  import requests
    skipped 7 lines
    18 19  import re
    19 20  import textwrap
    20 21  import time
     22 +import pprint
    21 23   
    22 24  from libs.utils import (
    23 25   print_banner,
    24 26   color_print_green,
    25 27   populate_user,
    26 28   process_message,
     29 + process_description,
    27 30   parse_tg_date,
    28 31   parse_html_page
    29 32  )
    skipped 16 lines
    46 49  from telethon.utils import get_display_name, get_message_id
    47 50  from alive_progress import alive_bar
    48 51  from bs4 import BeautifulSoup
     52 +import pikepdf
     53 +from hachoir.parser import createParser
     54 +from hachoir.metadata import extractMetadata
     55 + 
    49 56   
    50 57   
    51 58  @click.command()
    skipped 57 lines
    109 116   location_check = False
    110 117   last_date = None
    111 118   chunk_size = 1000
    112  - forwards_check = False
    113 119   filetime = datetime.datetime.now().strftime("%Y_%m_%d-%H_%M")
    114 120   filetime_clean = str(filetime)
    115 121   
    skipped 103 lines
    219 225   group_description = web_req["group_description"]
    220 226   total_participants = web_req["total_participants"]
    221 227   
     228 + _desc = process_description(
     229 + group_description, user_language
     230 + )
      231 + description_text = _desc["description_text"]
      232 + original_language = _desc[
      233 + "original_language"
      234 + ]
     235 + translated_description = _desc["translated_text"]
     236 + 
    222 237   if Dialog.entity.broadcast is True:
    223 238   chat_type = "Channel"
    224 239   elif Dialog.entity.megagroup is True:
    skipped 26 lines
    251 266   filetime,
    252 267   Dialog.entity.title,
    253 268   group_description,
     269 + translated_description,
    254 270   total_participants,
    255 271   group_username,
    256 272   group_url,
    skipped 10 lines
    267 283   "Access Date",
    268 284   "Title",
    269 285   "Description",
     286 + "Translated description",
    270 287   "Total participants",
    271 288   "Username",
    272 289   "URL",
    skipped 28 lines
    301 318   if "https://t.me/+" in t:
    302 319   t = t.replace('https://t.me/+', 'https://t.me/joinchat/')
    303 320   
    304  - 
    305  - save_directory = telepathy_file + alphanumeric
    306  - try:
    307  - os.makedirs(save_directory)
    308  - except FileExistsError:
    309  - pass
     321 + if basic is True or comp_check is True:
     322 + save_directory = telepathy_file + alphanumeric
     323 + try:
     324 + os.makedirs(save_directory)
     325 + except FileExistsError:
     326 + pass
    310 327   
    311 328   # Creating logfile
    312 329   log_file = telepathy_file + "log.csv"
    skipped 5 lines
    318 335   except FileExistsError:
    319 336   pass
    320 337   
    321  - if basic == True:
     338 + if basic == True and comp_check == False:
    322 339   color_print_green(" [!] ", "Performing basic scan")
    323 340   elif comp_check == True:
    324 341   color_print_green(" [!] ", "Performing comprehensive scan")
    skipped 81 lines
    406 423   
    407 424   group_description = web_req["group_description"]
    408 425   total_participants = web_req["total_participants"]
     426 + 
     427 + _desc = process_description(
     428 + group_description, user_language
     429 + )
     430 + description_text = _desc["description_text"]
     431 + original_language = _desc[
     432 + "original_language"
     433 + ]
     434 + 
     435 + translated_description = _desc["translated_text"]
     436 + 
    409 437   preferredWidth = 70
    410  - descript = Fore.GREEN + "Description: " + Style.RESET_ALL + web_req["group_description"]
    411  - prefix = descript + " "
     438 + descript = Fore.GREEN + "Description: " + Style.RESET_ALL
     439 + prefix = descript
    412 440   wrapper_d = textwrap.TextWrapper(
    413  - initial_indent=descript,
     441 + initial_indent=prefix,
    414 442   width=preferredWidth,
    415 443   subsequent_indent=" ",
    416 444   )
     445 + 
     446 + trans_descript = Fore.GREEN + "Translated: " + Style.RESET_ALL
     447 + prefix = trans_descript
     448 + wrapper_td = textwrap.TextWrapper(
     449 + initial_indent=prefix,
     450 + width=preferredWidth,
     451 + subsequent_indent=" ",
     452 + )
     453 + 
     454 + group_description = ('"' + group_description + '"')
    417 455   
    418 456   if entity.broadcast is True:
    419 457   chat_type = "Channel"
    skipped 78 lines
    498 536   color_print_green(" ┬ Chat details", "")
    499 537   color_print_green(" ├ Title: ", str(entity.title))
    500 538   color_print_green(" ├ ", wrapper_d.fill(group_description))
     539 + if translated_description != group_description:
     540 + color_print_green(" ├ ", wrapper_td.fill(translated_description))
    501 541   color_print_green(
    502 542   " ├ Total participants: ", str(total_participants)
    503 543   )
    skipped 31 lines
    535 575   color_print_green(
     536 576   " └ ", wrapper_r.fill(group_status)
    537 577   )
    538  - print("\n")
     578 + #print("\n")
    539 579   
    540 580   log.append(
    541 581   [
    542 582   filetime,
    543 583   entity.title,
    544 584   group_description,
     585 + translated_description,
    545 586   total_participants,
    546 587   found_participants,
    547 588   group_username,
    skipped 14 lines
    562 603   "Access Date",
    563 604   "Title",
    564 605   "Description",
     606 + "Translated description",
    565 607   "Total participants",
    566 608   "Participants found",
    567 609   "Username",
    skipped 41 lines
    609 651   if message.forward is not None:
    610 652   forward_count += 1
    611 653   
    612  - print("\n")
     654 + #print("\n")
    613 655   color_print_green(" [-] ", "Fetching forwarded messages...")
    614 656   
    615 657   progress_bar = (
    skipped 131 lines
    747 789   df02 = forwards_df.From.unique()
    748 790   unique_forwards = len(df02)
    749 791   
    750  - print("\n")
     792 + #print("\n")
    751 793   color_print_green(" [+] Forward scrape complete", "")
    752 794   color_print_green(" ┬ Statistics", "")
    753 795   color_print_green(
    skipped 22 lines
    776 818   " ├ Top forward source 5: ", str(forward_five)
    777 819   )
    778 820  color_print_green(" └ Edgelist saved to: ", edgelist_file)
    779  - print("\n")
     821 + #print("\n")
    780 822   
    781 823   else:
    782 824   print(
    skipped 23 lines
    806 848   private_count = 0
    807 849   
    808 850   if media_archive is True:
     851 + files = []
     852 + print("\n")
    809 853   color_print_green(
    810 854   " [!] ", "Media content will be archived"
    811 855   )
    812 856   
    813  - print("\n")
    814 857   color_print_green(
    815 858   " [!] ", "Calculating number of messages..."
    816 859   )
    skipped 182 lines
    999 1042   else:
    1000 1043   views = "Not found"
    1001 1044   
    1002  - if message.reactions:
    1003  - if message.reactions.can_see_list:
    1004  - print("#### TODO: REACTIONS")
     1045 + #if message.reactions:
     1046 + #if message.reactions.can_see_list:
     1047 + #print(dir(message.reactions.results))
     1048 + #print("#### TODO: REACTIONS")
     1049 + 
     1050 + if media_archive == True:
     1051 + #add a progress bar for each file download
     1052 + if message.media:
     1053 + path = await message.download_media(
     1054 + file=media_directory
     1055 + )
     1056 + files.append(path)
     1057 + else:
     1058 + pass
     1059 + 
     1060 +
    1005 1061   
    1006 1062   message_list.append(
    1007 1063   [
    skipped 106 lines
    1114 1170   ]
    1115 1171   )
    1116 1172   
    1117  - if media_archive == True:
    1118  - if message.media:
    1119  - path = await message.download_media(
    1120  - file=media_directory
    1121  - )
    1122  - else:
    1123  - pass
     1173 +
    1124 1174   
    1125 1175   except ChannelPrivateError:
    1126 1176   private_count += 1
    skipped 24 lines
    1151 1201   time.sleep(0.5)
    1152 1202   bar()
    1153 1203   
    1154  - if len(replies_list) > 0:
    1155  - with open(
    1156  - reply_file_archive, "w+", encoding="utf-8"
    1157  - ) as rep_file:
    1158  - c_replies.to_csv(rep_file, sep=";")
     1204 + if reply_analysis is True:
     1205 + if len(replies_list) > 0:
     1206 + with open(
     1207 + reply_file_archive, "w+", encoding="utf-8"
     1208 + ) as rep_file:
     1209 + c_replies.to_csv(rep_file, sep=";")
    1159 1210   
    1160  - if len(user_replier_list) > 0:
    1161  - with open(
    1162  - reply_memberlist_filename, "w+", encoding="utf-8"
    1163  - ) as repliers_file:
    1164  - c_repliers.to_csv(repliers_file, sep=";")
     1211 + if len(user_replier_list) > 0:
     1212 + with open(
     1213 + reply_memberlist_filename, "w+", encoding="utf-8"
     1214 + ) as repliers_file:
     1215 + c_repliers.to_csv(repliers_file, sep=";")
    1165 1216   
    1166 1217   with open(
    1167 1218   file_archive, "w+", encoding="utf-8"
    skipped 90 lines
    1258 1309   df04 = c_archive.Display_name.unique()
    1259 1310   plength = len(df03)
    1260 1311   unique_active = len(df04)
     1312 + # one day this'll work out sleeping times
     1313 + # print(c_t_stats)
     1314 + 
     1315 + elif reply_analysis is True:
     1316 + if len(replies_list) > 0:
     1317 + replier_count = c_repliers["User id"].count()
     1318 + replier_value_count = c_repliers["User id"].value_counts()
     1319 + replier_df = replier_value_count.rename_axis(
     1320 + "unique_values"
     1321 + ).reset_index(name="counts")
     1322 + 
     1323 + top_replier_one = str(replier_df.iloc[0]["unique_values"])
     1324 + top_replier_value_one = replier_df.iloc[0]["counts"]
     1325 + top_replier_two = str(replier_df.iloc[1]["unique_values"])
     1326 + top_replier_value_two = replier_df.iloc[1]["counts"]
     1327 + top_replier_three = str(replier_df.iloc[2]["unique_values"])
     1328 + top_replier_value_three = replier_df.iloc[2]["counts"]
     1329 + top_replier_four = str(replier_df.iloc[3]["unique_values"])
     1330 + top_replier_value_four = replier_df.iloc[3]["counts"]
     1331 + top_replier_five = str(replier_df.iloc[4]["unique_values"])
     1332 + top_replier_value_five = replier_df.iloc[4]["counts"]
     1333 + 
     1334 + replier_one = (
     1335 + str(top_replier_one)
     1336 + + ", "
     1337 + + str(top_replier_value_one)
     1338 + + " replies"
     1339 + )
     1340 + replier_two = (
     1341 + str(top_replier_two)
     1342 + + ", "
     1343 + + str(top_replier_value_two)
     1344 + + " replies"
     1345 + )
     1346 + replier_three = (
     1347 + str(top_replier_three)
     1348 + + ", "
     1349 + + str(top_replier_value_three)
     1350 + + " replies"
     1351 + )
     1352 + replier_four = (
     1353 + str(top_replier_four)
     1354 + + ", "
     1355 + + str(top_replier_value_four)
     1356 + + " replies"
     1357 + )
     1358 + replier_five = (
     1359 + str(top_replier_five)
     1360 + + ", "
     1361 + + str(top_replier_value_five)
     1362 + + " replies"
     1363 + )
     1364 + 
     1365 + replier_count_df = c_repliers["User id"].unique()
     1366 + replier_length = len(replier_df)
     1367 + replier_unique = len(replier_count_df)
    1261 1368   
    1262 1369   else:
    1263 1370   pass
    1264  - # one day this'll work out sleeping times
    1265  - # print(c_t_stats)
    1266 1371   
    1267  - print("\n")
     1372 + #print("\n")
    1268 1373   color_print_green(" [+] Chat archive saved", "")
    1269 1374   color_print_green(" ┬ Chat statistics", "")
    1270 1375   color_print_green(
    skipped 20 lines
    1291 1396   " ├ Total unique posters: ", str(unique_active)
    1292 1397   )
    1293 1398  
    1294  - # add a figure for unique current posters who are active
    1295 1399   else:
    1296 1400   pass
    1297 1401   # timestamp analysis
    skipped 6 lines
    1304 1408   " └ Archive saved to: ", str(file_archive)
    1305 1409   )
    1306 1410   
    1307  - if len(replies_list) > 0:
    1308  - middle_char = "├"
    1309  - if user_replier_list == 0:
    1310  - middle_char = "└"
     1411 + if reply_analysis is True:
     1412 + if len(replies_list) > 0:
     1413 + middle_char = "├"
     1414 + if user_replier_list == 0:
     1415 + middle_char = "└"
    1311 1416   
    1312  - print("\n")
    1313  - color_print_green(" [+] Replies analysis ", "")
    1314  - color_print_green(" ┬ Chat statistics", "")
    1315  - color_print_green(
    1316  - f" {middle_char} Archive of replies saved to: ",
    1317  - str(reply_file_archive),
    1318  - )
    1319  - if len(user_replier_list) > 0:
     1417 + #print("\n")
     1418 + color_print_green(" [+] Replies analysis ", "")
     1419 + color_print_green(" ┬ Chat statistics", "")
    1320 1420   color_print_green(
    1321  - " └ Active members list who replied to messages, saved to: ",
    1322  - str(reply_memberlist_filename),
     1421 + f" {middle_char} Archive of replies saved to: ",
     1422 + str(reply_file_archive),
    1323 1423   )
     1424 + if len(user_replier_list) > 0:
     1425 + color_print_green(
     1426 + " └ Active members list who replied to messages, saved to: ",
     1427 + str(reply_memberlist_filename),
     1428 + )
     1429 + 
     1430 + color_print_green(
     1431 + " ├ Top replier 1: ", str(replier_one)
     1432 + )
     1433 + color_print_green(
     1434 + " ├ Top replier 2: ", str(replier_two)
     1435 + )
     1436 + color_print_green(
     1437 + " ├ Top replier 3: ", str(replier_three)
     1438 + )
     1439 + color_print_green(
     1440 + " ├ Top replier 4: ", str(replier_four)
     1441 + )
     1442 + color_print_green(
     1443 + " ├ Top replier 5: ", str(replier_five)
     1444 + )
     1445 + color_print_green(
     1446 + " ├ Total unique repliers: ", str(replier_unique)
     1447 + )
     1448 + # add a figure for unique current posters who are active
    1324 1449   
    1325 1450   if forwards_check is True:
    1326 1451   if forward_count >= 15:
    skipped 54 lines
    1381 1506   c_f_unique = c_forwards.From.unique()
    1382 1507   unique_forwards = len(c_f_unique)
    1383 1508   
    1384  - print("\n")
     1509 + #print("\n")
    1385 1510   color_print_green(" [+] Edgelist saved", "")
    1386 1511   color_print_green(
    1387 1512   " ┬ Forwarded message statistics", ""
    skipped 33 lines
    1421 1546   color_print_green(
    1422 1547   " └ Edgelist saved to: ", edgelist_file
    1423 1548   )
    1424  - print("\n")
     1549 + #print("\n")
    1425 1550   
    1426 1551   else:
    1427  - print("\n")
     1552 + #print("\n")
    1428 1553   color_print_green(
    1429 1554   " [!] Insufficient forwarded messages found",
    1430 1555   edgelist_file,
    skipped 5 lines
    1436 1561   my_user = None
    1437 1562   try:
    1438 1563   
    1439  - print(
    1440  - Fore.GREEN
    1441  - + " [+] "
    1442  - + Style.RESET_ALL
    1443  - + "User details for "
    1444  - + t
    1445  - )
    1446 1564   user = int(t)
    1447 1565   my_user = await client.get_entity(PeerUser(int(user)))
    1448 1566   
    skipped 76 lines
    1525 1643   + "_"
    1526 1644   + longitude
    1527 1645   + "_"
    1528  - + "locations.csv"
     1646 + + "locations_"
     1647 + + filetime_clean
     1648 + + ".csv"
    1529 1649   )
    1530 1650   
    1531 1651   locations_list = []
    skipped 70 lines
    1602 1722  
    1603 1723   with client:
    1604 1724   client.loop.run_until_complete(main())
    1605  - 
    1606 1725   
    1607 1726  if __name__ == "__main__":
    1608 1727   cli()
    skipped 1 lines