  • Fixed several false positives, improved statistics info (#368)

    * Fixed several false positives, improved statistics info
    
    * Updated site list and statistics
  • Soxoj committed with GitHub 3 years ago
    dcf5181e
    1 parent 61452d56
  • maigret/resources/data.json
    skipped 5202 lines
    5203 5203   ],
    5204 5204   "checkType": "message",
    5205 5205   "presenceStrs": [
    5206  - "userStatsTitle"
     5206 + "<meta content=\"name=Profile"
     5207 + ],
     5208 + "absenceStrs": [
     5209 + "<title>Foursquare "
    5207 5210   ],
    5208 5211   "alexaRank": 3413,
    5209  - "urlMain": "https://ru.foursquare.com/",
    5210  - "url": "https://ru.foursquare.com/{username}",
     5212 + "urlMain": "https://foursquare.com/",
     5213 + "url": "https://foursquare.com/{username}",
    5211 5214   "usernameClaimed": "adam",
    5212 5215   "usernameUnclaimed": "noonewouldeverusethis7"
    5213 5216   },
    skipped 1096 lines
    6310 6313   ],
    6311 6314   "checkType": "message",
    6312 6315   "absenceStrs": [
    6313  - "Page not found."
     6316 + "Page not found"
     6317 + ],
     6318 + "presenseStrs": [
     6319 + "title=\"Gumroad\""
    6314 6320   ],
    6315 6321   "alexaRank": 4728,
    6316 6322   "urlMain": "https://www.gumroad.com/",
    skipped 2540 lines
    8857 8863   ],
    8858 8864   "checkType": "message",
    8859 8865   "absenceStrs": [
    8860  - "\u0417\u0434\u0435\u0441\u044c \u043f\u043e\u043a\u0430 \u043d\u0438\u0447\u0435\u0433\u043e \u043d\u0435\u0442"
     8866 + "\u041f\u043e \u0412\u0430\u0448\u0435\u043c\u0443 \u0437\u0430\u043f\u0440\u043e\u0441\u0443 \u043d\u0438\u0447\u0435\u0433\u043e \u043d\u0435 \u043d\u0430\u0439\u0434\u0435\u043d\u043e"
     8867 + ],
     8868 + "presenseStrs": [
     8869 + "<span>\u041b\u044e\u0434\u0438</span>"
    8861 8870   ],
    8862 8871   "alexaRank": 6409,
    8863 8872   "urlMain": "https://mirtesen.ru",
    skipped 1302 lines
    10166 10175   "tags": [
    10167 10176   "ru"
    10168 10177   ],
    10169  - "checkType": "message",
    10170  - "absenceStrs": [
    10171  - "404 - Not Found"
    10172  - ],
     10178 + "checkType": "status_code",
    10173 10179   "alexaRank": 25200,
    10174 10180   "urlMain": "https://overclockers.ru",
    10175 10181   "url": "https://overclockers.ru/cpubase/user/{username}",
    skipped 538 lines
    10714 10720   "checkType": "message",
    10715 10721   "absenceStrs": [
    10716 10722   "Hmm, it seems that you've come across an invalid username",
    10717  - "404 Not Found"
     10723 + "404 Not Found",
     10724 + "Member Not Found"
     10725 + ],
     10726 + "presenseStrs": [
     10727 + "profile on Planet Minecraft to see their public Minecraft community activity"
    10718 10728   ],
    10719 10729   "alexaRank": 9050,
    10720 10730   "urlMain": "https://www.planetminecraft.com",
    skipped 2130 lines
    12851 12861   "tags": [
    12852 12862   "music"
    12853 12863   ],
    12854  - "checkType": "status_code",
     12864 + "checkType": "message",
     12865 + "presenseStrs": [
     12866 + "Profile: "
     12867 + ],
     12868 + "absenceStrs": [
     12869 + "Smule | Page Not Found (404)"
     12870 + ],
    12855 12871   "alexaRank": 11742,
    12856 12872   "urlMain": "https://www.smule.com/",
    12857 12873   "url": "https://www.smule.com/{username}",
    skipped 259 lines
    13117 13133   "us"
    13118 13134   ],
    13119 13135   "headers": {
    13120  - "authorization": "Bearer BQC-v69M-AcXsPLrSktz0Era-J2P1SXWB42HLKRHnCNpj00jLEbbbDFpIFo1UhBKrHrL7FqLQd-X4MIuhFo"
     13136 + "authorization": "Bearer BQBFTijjpshGAhX7n9-sO46wb8zJIkhu6TT3Ss7b-0V1dw_jXZhcff1agUpqRgbhznOG8pSIRlHtJAtd2TU"
    13121 13137   },
    13122 13138   "errors": {
    13123 13139   "Spotify is currently not available in your country.": "Access denied in your country, use proxy/vpn"
    skipped 1849 lines
    14973 14989   "video"
    14974 14990   ],
    14975 14991   "headers": {
    14976  - "Authorization": "jwt eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJleHAiOjE2NDExNzg4NjAsInVzZXJfaWQiOm51bGwsImFwcF9pZCI6NTg0NzksInNjb3BlcyI6InB1YmxpYyIsInRlYW1fdXNlcl9pZCI6bnVsbH0.9rznMue0JmX9SAPuWQDIYR-mmsozFq5PoKUvlvElpkQ"
     14992 + "Authorization": "jwt eyJ0eXAiOiJKV1QiLCJhbGciOiJIUzI1NiJ9.eyJleHAiOjE2NDU4Nzg1NDAsInVzZXJfaWQiOm51bGwsImFwcF9pZCI6NTg0NzksInNjb3BlcyI6InB1YmxpYyIsInRlYW1fdXNlcl9pZCI6bnVsbH0.Bs6VBcKPsl-5dqoThdAImBIex1mas1UcyG2pSnIYqYk"
    14977 14993   },
    14978 14994   "activation": {
    14979 14995   "url": "https://vimeo.com/_rv/viewer",
    skipped 14111 lines
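The data.json hunks above mostly add a second marker list to `"checkType": "message"` entries, so a profile is no longer inferred from a missing error string alone. A minimal sketch of how such a two-factor check could be evaluated is below. The function name `message_check`, its exact semantics, and the sample HTML snippets are assumptions made for illustration, not maigret's actual implementation; only the Smule marker strings come from the diff.

```python
from typing import Iterable


def message_check(html: str,
                  presence_strs: Iterable[str] = (),
                  absence_strs: Iterable[str] = ()) -> bool:
    """Return True when the page looks like an existing profile.

    Assumed semantics (not taken from maigret's source):
    - any absence marker found in the page means "no such user";
    - if presence markers are configured, at least one must appear.
    """
    if any(marker in html for marker in absence_strs):
        return False
    presence = list(presence_strs)
    if presence and not any(marker in html for marker in presence):
        return False
    return True


# Marker strings copied from the Smule entry in the diff above.
SMULE_PRESENCE = ["Profile: "]
SMULE_ABSENCE = ["Smule | Page Not Found (404)"]

# The page bodies below are made-up examples, not real Smule responses.
found = message_check("<title>Profile: adam | Smule</title>",
                      SMULE_PRESENCE, SMULE_ABSENCE)
missing = message_check("<title>Smule | Page Not Found (404)</title>",
                        SMULE_PRESENCE, SMULE_ABSENCE)
unclear = message_check("<title>Smule</title>",
                        SMULE_PRESENCE, SMULE_ABSENCE)

# With only absence markers (a "one-factor" check), an unrelated page
# such as a captcha or a redirect would still count as a hit.
one_factor = message_check("<title>Smule</title>", absence_strs=SMULE_ABSENCE)
```

Under these assumptions the `unclear` page is rejected by the two-factor check but accepted by the one-factor one, which is the false-positive risk that the "Incomplete checks" statistic counts.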
  • maigret/sites.py
    skipped 449 lines
    450 450   for tag in filter(lambda x: not is_country_tag(x), site.tags):
    451 451   tags[tag] = tags.get(tag, 0) + 1
    452 452   
    453  - output += f"Enabled/total sites: {total_count - disabled_count}/{total_count}\n\n"
    454  - output += f"Incomplete checks: {message_checks_one_factor}/{message_checks} (false positive risks)\n\n"
     453 + enabled_perc = round(100*(total_count-disabled_count)/total_count, 2)
     454 + output += f"Enabled/total sites: {total_count - disabled_count}/{total_count} = {enabled_perc}%\n\n"
     455 + 
     456 + checks_perc = round(100*message_checks_one_factor/message_checks, 2)
     457 + output += f"Incomplete checks: {message_checks_one_factor}/{message_checks} = {checks_perc}% (false positive risks)\n\n"
    455 458   
    456 459   top_urls_count = 20
    457 460   output += f"Top {top_urls_count} profile URLs:\n"
    skipped 15 lines
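The added lines in sites.py divide by `total_count` and `message_checks` directly, which would raise `ZeroDivisionError` on an empty or fully filtered site list. A standalone sketch of the same percentage formatting with a guard is below. The helper names `percent` and `stats_header` are hypothetical and not part of maigret; the counter names and output strings mirror the diff, and the sample figures are the ones shown in the sites.md hunk.

```python
def percent(part: int, total: int, digits: int = 2) -> float:
    """Return part/total as a percentage, or 0.0 when total is zero."""
    if total == 0:
        return 0.0
    return round(100 * part / total, digits)


def stats_header(total_count: int, disabled_count: int,
                 message_checks: int, message_checks_one_factor: int) -> str:
    """Build the first two statistics lines in the format used by the diff."""
    enabled_count = total_count - disabled_count
    enabled_perc = percent(enabled_count, total_count)
    checks_perc = percent(message_checks_one_factor, message_checks)

    output = f"Enabled/total sites: {enabled_count}/{total_count} = {enabled_perc}%\n\n"
    output += (f"Incomplete checks: {message_checks_one_factor}/{message_checks}"
               f" = {checks_perc}% (false positive risks)\n\n")
    return output


# Figures from the sites.md hunk: 2447 of 2595 sites enabled,
# 582 of 1978 message checks relying on a single marker list.
header = stats_header(total_count=2595, disabled_count=2595 - 2447,
                      message_checks=1978, message_checks_one_factor=582)
```

Returning 0.0 for an empty denominator is one possible choice; omitting the percentage in that case would be equally reasonable.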
  • sites.md
    skipped 248 lines
    249 249  1. ![](https://www.google.com/s2/favicons?domain=https://forum.xda-developers.com) [XDA (https://forum.xda-developers.com)](https://forum.xda-developers.com)*: top 5K, apps, forum*, search is disabled
    250 250  1. ![](https://www.google.com/s2/favicons?domain=https://i.thechive.com/) [Thechive (https://i.thechive.com/)](https://i.thechive.com/)*: top 5K, us*
    251 251  1. ![](https://www.google.com/s2/favicons?domain=https://999.md) [999.md (https://999.md)](https://999.md)*: top 5K, freelance, md, shopping*
    252  -1. ![](https://www.google.com/s2/favicons?domain=https://ru.foursquare.com/) [Foursquare (https://ru.foursquare.com/)](https://ru.foursquare.com/)*: top 5K, geosocial, in*
     252 +1. ![](https://www.google.com/s2/favicons?domain=https://foursquare.com/) [Foursquare (https://foursquare.com/)](https://foursquare.com/)*: top 5K, geosocial, in*
    253 253  1. ![](https://www.google.com/s2/favicons?domain=https://4pda.ru/) [4pda (https://4pda.ru/)](https://4pda.ru/)*: top 5K, ru*
    254 254  1. ![](https://www.google.com/s2/favicons?domain=https://www.weforum.org) [Weforum (https://www.weforum.org)](https://www.weforum.org)*: top 5K, forum, us*
    255 255  1. ![](https://www.google.com/s2/favicons?domain=http://www.techspot.com/community/) [techspot.com (http://www.techspot.com/community/)](http://www.techspot.com/community/)*: top 5K, forum, us*
    skipped 2343 lines
    2599 2599  1. ![](https://www.google.com/s2/favicons?domain=https://www.hozpitality.com) [hozpitality (https://www.hozpitality.com)](https://www.hozpitality.com)*: top 100M*
    2600 2600  1. ![](https://www.google.com/s2/favicons?domain=https://kazanlashkigalab.com) [kazanlashkigalab.com (https://kazanlashkigalab.com)](https://kazanlashkigalab.com)*: top 100M, kz*
    2601 2601   
    2602  -Alexa.com rank data fetched at (2022-02-26 11:41:48.847517 UTC)
     2602 +Alexa.com rank data fetched at (2022-02-26 12:19:53.127789 UTC)
    2603 2603  ## Statistics
    2604 2604   
    2605  -Enabled/total sites: 2447/2595
     2605 +Enabled/total sites: 2447/2595 = 94.3%
    2606 2606   
    2607  -Incomplete checks: 586/1978 (false positive risks)
     2607 +Incomplete checks: 582/1978 = 29.42% (false positive risks)
    2608 2608   
    2609 2609  Top 20 profile URLs:
    2610 2610  - (796) `{urlMain}/index/8-0-{username} (uCoz)`
    skipped 23 lines
    2634 2634  - (40) `NO_TAGS` (non-standard)
    2635 2635  - (24) `coding`
    2636 2636  - (23) `photo`
    2637  -- (19) `news`
     2637 +- (18) `news`
    2638 2638  - (18) `blog`
    2639  -- (18) `music`
     2639 +- (17) `music`
    2640 2640  - (15) `tech`
    2641 2641  - (13) `freelance`
    2642 2642  - (12) `sharing`
    skipped 10 lines