OnionSearch: commit 9aa506a5
  • Added 18 search engines; it is now possible to choose which engines to query or exclude, two display modes are available for the progress bar, and the script now uses a random user agent when making requests

  • Gobarigo committed 4 years ago
    9aa506a5
    1 parent 727893cf
    README.md
    skipped 1 line
    2 2  ## Educational purposes only
    3 3  [![forthebadge made-with-python](http://ForTheBadge.com/images/badges/made-with-python.svg)](https://www.python.org/)
    4 4   
    5  -OnionSearch is a script that scrapes urls on different .onion search engines. In 30 minutes you get 10,000 unique urls.
     5 +OnionSearch is a Python 3 script that scrapes URLs from various ".onion" search engines.
     6 +In about 30 minutes you can collect thousands of unique URLs.
     7 + 
    6 8  ## 💡 Prerequisite
    7 9  [Python 3](https://www.python.org/download/releases/3.0/)
    8 10  
    9  -## 🔍 Search engines used
     11 +## 🔍 Currently supported search engines
    10 12  - Ahmia
    11  -- Torch
     13 +- TORCH
    12 14  - Darksearch io
    13 15  - OnionLand
     16 +- not Evil
     17 +- VisiTOR
     18 +- Dark Search Enginer
     19 +- Phobos
     20 +- Onion Search Server
     21 +- Grams
     22 +- Candle
     23 +- Tor Search Engine
     24 +- Torgle
     25 +- Onion Search Engine
     26 +- Tordex
     27 +- Tor66
     28 +- Tormax
     29 +- Haystack
     30 +- Multivac
     31 +- Evo Search
     32 +- Oneirun
     33 +- DeepLink
    14 34   
    15 35  ## 🛠️ Installation
     36 + 
    16 37  ```
    17 38  git clone https://github.com/megadose/OnionSearch.git
    18 39  cd OnionSearch
    19 40  pip3 install -r requirements.txt
     41 +pip3 install 'urllib3[socks]'
    20 42  python3 search.py -h
    21 43  ```
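
OnionSearch sends every request through a Tor SOCKS proxy (127.0.0.1:9050 by default, configurable with `--proxy`). Before launching a search, you can check that the proxy is reachable with a short snippet like the one below (a minimal sketch, not part of OnionSearch; it assumes a local Tor daemon and uses check.torproject.org only as an example target):

```
import requests

# Same proxy settings the script builds from its --proxy argument
proxies = {"http": "socks5h://127.0.0.1:9050", "https": "socks5h://127.0.0.1:9050"}

# Any reachable URL works here; a 200 status means the SOCKS proxy is usable
resp = requests.get("https://check.torproject.org", proxies=proxies, timeout=30)
print(resp.status_code)
```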
     44 + 
    22 45  ## 📈 Usage
     46 + 
    23 47  ```
    24  -python3 search.py [-h] --search "search" [--proxy 127.0.0.1:1337] [--output mylinks.txt]
    25  -python3 search.py --search "computer" --output computer.txt
      48 +usage: search.py [-h] [--proxy PROXY] [--output OUTPUT] [--limit LIMIT]
      49 +                 [--barmode BARMODE] [--engines [ENGINES [ENGINES ...]]]
      50 +                 [--exclude [EXCLUDE [EXCLUDE ...]]]
      51 +                 search
      52 + 
      53 +positional arguments:
      54 +  search             The search string or phrase
      55 + 
      56 +optional arguments:
      57 +  -h, --help         show this help message and exit
      58 +  --proxy PROXY      Set Tor proxy (default: 127.0.0.1:9050)
      59 +  --output OUTPUT    Output File (default: output.txt)
      60 +  --limit LIMIT      Set a max number of pages per engine to load
      61 +  --barmode BARMODE  Can be 'fixed' (default) or 'unknown'
      62 +  --engines [ENGINES [ENGINES ...]]
      63 +                     Engines to request (default: full list)
      64 +  --exclude [EXCLUDE [EXCLUDE ...]]
      65 +                     Engines to exclude (default: none)
     66 +```
     67 + 
     68 +### Examples
     69 + 
      70 +To search for the string "computer" on all engines, writing results to the default output file:
     71 +```
     72 +python3 search.py "computer"
    26 73  ```
     74 + 
      75 +To query all engines except "Ahmia" and "Candle":
     76 +```
     77 +python3 search.py "computer" --proxy 127.0.0.1:1337 --exclude ahmia candle
     78 +```
     79 + 
      80 +To query only "Tor66", "DeepLink" and "Phobos":
     81 +```
     82 +python3 search.py "computer" --proxy 127.0.0.1:1337 --engines tor66 deeplink phobos
     83 +```
     84 + 
      85 +The same search, but limiting the number of pages to load per engine to 3:
     86 +```
     87 +python3 search.py "computer" --proxy 127.0.0.1:1337 --engines tor66 deeplink phobos --limit 3
     88 +```
     89 + 
      90 +Note that the list of supported engines (and their keys) is given in the script help (-h).
     91 + 
     92 +### Output
     93 + 
      94 +The file written at the end of the process is a CSV containing the following columns:
     95 +```
     96 +"engine","name of the link","url"
     97 +```
     98 + 
      99 +The name and URL strings are sanitized as much as possible, but some malformed entries may remain.
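
For example, the output file can be read back with Python's standard csv module (a minimal sketch; "output.txt" assumes the default --output value, and rows broken by stray quotes are simply skipped):

```
import csv

with open("output.txt", newline="") as f:
    for row in csv.reader(f):
        if len(row) == 3:          # skip rows that imperfect quoting may have broken
            engine, name, url = row
            print(engine, url)
```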
     100 + 
     101 + 
    27 102  ## 📝 License
    28 103  [GNU General Public License v3.0](https://www.gnu.org/licenses/gpl-3.0.fr.html)
    29 104   
    skipped 2 lines
    search.py
    1  -import requests,json
     1 +import argparse
     2 +import math
     3 +import re
     4 +import time
     5 +from datetime import datetime
     6 +from random import choice
     7 + 
     8 +import requests
    2 9  from bs4 import BeautifulSoup
    3  -import argparse
    4 10  from tqdm import tqdm
    5  -parser = argparse.ArgumentParser()
    6  -required = parser.add_argument_group('required arguments')
     11 + 
     12 +desktop_agents = [
     13 + 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36',
     14 + 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36',
     15 + 'Mozilla/5.0 (Windows NT 10.0; Win64; x64) '
     16 + 'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36',
     17 + 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_1) '
     18 + 'AppleWebKit/602.2.14 (KHTML, like Gecko) Version/10.0.1 Safari/602.2.14',
     19 + 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36',
     20 + 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_12_1) '
     21 + 'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36',
     22 + 'Mozilla/5.0 (Macintosh; Intel Mac OS X 10_11_6) '
     23 + 'AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.98 Safari/537.36',
     24 + 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.71 Safari/537.36',
     25 + 'Mozilla/5.0 (Windows NT 6.1; Win64; x64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36',
     26 + 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:50.0) Gecko/20100101 Firefox/50.0'
     27 +]
     28 + 
     29 +supported_engines = [
     30 + "ahmia",
     31 + "torch",
     32 + "darksearchio",
     33 + "onionland",
     34 + "notevil",
     35 + "visitor",
     36 + "darksearchenginer",
     37 + "phobos",
     38 + "onionsearchserver",
     39 + "grams",
     40 + "candle",
     41 + "torsearchengine",
     42 + "torgle",
     43 + "onionsearchengine",
     44 + "tordex",
     45 + "tor66",
     46 + "tormax",
     47 + "haystack",
     48 + "multivac",
     49 + "evosearch",
     50 + "oneirun",
     51 + "deeplink",
     52 +]
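# Note: each key above matches the name of a scraper function defined below; scrape() invokes them by name via call_func_as_str().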
     53 + 
     54 + 
     55 +def print_epilog():
      56 + epilog = "Supported engines ({}):".format(len(supported_engines))
     57 + for e in supported_engines:
     58 + epilog += " {}".format(e)
     59 + return epilog
     60 + 
     61 + 
     62 +parser = argparse.ArgumentParser(epilog=print_epilog())
    7 63  parser.add_argument("--proxy", default='localhost:9050', type=str, help="Set Tor proxy (default: 127.0.0.1:9050)")
    8 64  parser.add_argument("--output", default='output.txt', type=str, help="Output File (default: output.txt)")
     65 +parser.add_argument("search", type=str, help="The search string or phrase")
     66 +parser.add_argument("--limit", type=int, default=0, help="Set a max number of pages per engine to load")
     67 +parser.add_argument("--barmode", type=str, default="fixed", help="Can be 'fixed' (default) or 'unknown'")
     68 +parser.add_argument("--engines", type=str, action='append', help='Engines to request (default: full list)', nargs="*")
     69 +parser.add_argument("--exclude", type=str, action='append', help='Engines to exclude (default: none)', nargs="*")
    9 70   
    10  -parser.add_argument("--search",type=str, help="search")
    11 71  args = parser.parse_args()
    12 72  proxies = {'http': 'socks5h://{}'.format(args.proxy), 'https': 'socks5h://{}'.format(args.proxy)}
     73 +tqdm_bar_format = "{desc}: {percentage:3.0f}% |{bar}| {n_fmt:3s} / {total_fmt:3s} [{elapsed:5s} < {remaining:5s}]"
     74 +result = {}
     75 + 
     76 + 
     77 +def random_headers():
     78 + return {'User-Agent': choice(desktop_agents),
     79 + 'Accept': 'text/html,application/xhtml+xml,application/xml;q=0.9,image/webp,*/*;q=0.8'}
     80 + 
    13 81   
    14 82  def clear(toclear):
    15  - return(toclear.replace("\n","").replace(" ",""))
    16  -def clearn(toclear):
    17  - return(toclear.replace("\n"," "))
     83 + str = toclear.replace("\n", " ")
     84 + str = ' '.join(str.split())
     85 + return str
    18 86   
    19  -def scrape():
    20  - result = {}
    21  - ahmia = "http://msydqstlz2kzerdg.onion/search/?q="+args.search
    22  - response = requests.get(ahmia, proxies=proxies)
    23  - #print(response)
    24  - soup = BeautifulSoup(response.text, 'html.parser')
    25  - result['ahmia'] = []
    26  - #pageNumber = clear(soup.find("span", id="pageResultEnd").get_text())
    27  - for i in tqdm(soup.findAll('li', attrs = {'class' : 'result'}),desc="Ahmia"):
    28  - i = i.find('h4')
    29  - result['ahmia'].append({"name":clear(i.get_text()),"link":i.find('a')['href'].replace("/search/search/redirect?search_term=search&redirect_url=","")})
     87 + 
     88 +def ahmia(searchstr):
     89 + ahmia_url = "http://msydqstlz2kzerdg.onion/search/?q={}"
    30 90   
    31  - urlTorchNumber = "http://xmh57jrzrnw6insl.onion/4a1f6b371c/search.cgi?cmd=Search!&np=1&q="
    32  - req = requests.get(urlTorchNumber+args.search,proxies=proxies)
     91 + with tqdm(total=1, initial=0, desc="%20s" % "Ahmia", unit="req", ascii=False, ncols=120,
     92 + bar_format=tqdm_bar_format) as progress_bar:
     93 + response = requests.get(ahmia_url.format(searchstr), proxies=proxies, headers=random_headers())
     94 + soup = BeautifulSoup(response.text, 'html.parser')
     95 + link_finder("ahmia", soup)
     96 + progress_bar.update()
     97 + progress_bar.close()
     98 + 
     99 + 
     100 +def torch(searchstr):
     101 + torch_url = "http://xmh57jrzrnw6insl.onion/4a1f6b371c/search.cgi?cmd=Search!&np={}&q={}"
     102 + results_per_page = 10
     103 + max_nb_page = 100
     104 + if args.limit != 0:
     105 + max_nb_page = args.limit
     106 + 
     107 + with requests.Session() as s:
     108 + s.proxies = proxies
     109 + s.headers = random_headers()
     110 + 
     111 + req = s.get(torch_url.format(0, searchstr))
     112 + soup = BeautifulSoup(req.text, 'html.parser')
     113 + 
     114 + page_number = 1
     115 + for i in soup.find("table", attrs={"width": "100%"}).find_all("small"):
     116 + if i.get_text() is not None and "of" in i.get_text():
     117 + page_number = math.ceil(float(clear(i.get_text().split("-")[1].split("of")[1])) / results_per_page)
     118 + if page_number > max_nb_page:
     119 + page_number = max_nb_page
     120 + 
     121 + with tqdm(total=page_number, initial=0, desc="%20s" % "TORCH", unit="req", ascii=False, ncols=120,
     122 + bar_format=tqdm_bar_format) as progress_bar:
     123 + 
     124 + link_finder("torch", soup)
     125 + progress_bar.update()
     126 + 
     127 + # Usually range is 2 to n+1, but TORCH behaves differently
     128 + for n in range(1, page_number):
     129 + req = s.get(torch_url.format(n, searchstr))
     130 + soup = BeautifulSoup(req.text, 'html.parser')
     131 + link_finder("torch", soup)
     132 + progress_bar.update()
     133 + 
     134 + progress_bar.close()
     135 + 
     136 + 
     137 +def darksearchio(searchstr):
     138 + global result
     139 + result['darksearchio'] = []
     140 + darksearchio_url = "http://darksearch.io/api/search?query={}&page={}"
     141 + max_nb_page = 30
     142 + if args.limit != 0:
     143 + max_nb_page = args.limit
     144 + 
     145 + with requests.Session() as s:
     146 + s.proxies = proxies
     147 + s.headers = random_headers()
     148 + resp = s.get(darksearchio_url.format(searchstr, 1))
     149 + 
     150 + page_number = 1
     151 + if resp.status_code == 200:
     152 + resp = resp.json()
     153 + if 'last_page' in resp:
     154 + page_number = resp['last_page']
     155 + if page_number > max_nb_page:
     156 + page_number = max_nb_page
     157 + else:
     158 + return
     159 + 
     160 + with tqdm(total=page_number, initial=0, desc="%20s" % "DarkSearch (.io)", unit="req", ascii=False, ncols=120,
     161 + bar_format=tqdm_bar_format) as progress_bar:
     162 + 
     163 + link_finder("darksearchio", resp['data'])
     164 + progress_bar.update()
     165 + 
     166 + for n in range(2, page_number + 1):
     167 + resp = s.get(darksearchio_url.format(searchstr, n))
     168 + if resp.status_code == 200:
     169 + resp = resp.json()
     170 + link_finder("darksearchio", resp['data'])
     171 + progress_bar.update()
     172 + else:
     173 + # Current page results will be lost but we will try to continue after a short sleep
     174 + time.sleep(1)
     175 + 
     176 + progress_bar.close()
     177 + 
     178 + 
     179 +def onionland(searchstr):
     180 + onionlandv3_url = "http://3bbad7fauom4d6sgppalyqddsqbf5u5p56b5k5uk2zxsy3d6ey2jobad.onion/search?q={}&page={}"
     181 + max_nb_page = 100
     182 + if args.limit != 0:
     183 + max_nb_page = args.limit
     184 + 
     185 + with requests.Session() as s:
     186 + s.proxies = proxies
     187 + s.headers = random_headers()
     188 + 
     189 + resp = s.get(onionlandv3_url.format(searchstr, 1))
     190 + soup = BeautifulSoup(resp.text, 'html.parser')
     191 + 
     192 + page_number = 1
     193 + for i in soup.find_all('div', attrs={"class": "search-status"}):
     194 + approx_re = re.match(r"About ([,0-9]+) result(.*)",
     195 + clear(i.find('div', attrs={'class': "col-sm-12"}).get_text()))
     196 + if approx_re is not None:
     197 + nb_res = int((approx_re.group(1)).replace(",", ""))
     198 + results_per_page = 19
     199 + page_number = math.ceil(nb_res / results_per_page)
     200 + if page_number > max_nb_page:
     201 + page_number = max_nb_page
     202 + 
     203 + bar_max = None
     204 + if args.barmode == "fixed":
     205 + bar_max = max_nb_page
     206 + 
     207 + with tqdm(total=bar_max, initial=0, desc="%20s" % "OnionLand", unit="req", ascii=False, ncols=120,
     208 + bar_format=tqdm_bar_format) as progress_bar:
     209 + 
     210 + link_finder("onionland", soup)
     211 + progress_bar.update()
     212 + 
     213 + for n in range(2, page_number + 1):
     214 + resp = s.get(onionlandv3_url.format(searchstr, n))
     215 + soup = BeautifulSoup(resp.text, 'html.parser')
     216 + ret = link_finder("onionland", soup)
     217 + if ret < 0:
     218 + break
     219 + progress_bar.update()
     220 + 
     221 + progress_bar.close()
     222 + 
     223 + 
     224 +def notevil(searchstr):
     225 + notevil_url1 = "http://hss3uro2hsxfogfq.onion/index.php?q={}"
     226 + notevil_url2 = "http://hss3uro2hsxfogfq.onion/index.php?q={}&hostLimit=20&start={}&numRows={}&template=0"
     227 + max_nb_page = 20
     228 + if args.limit != 0:
     229 + max_nb_page = args.limit
     230 + 
      231 + # Do not use requests.Session() here (in practice it returns fewer results)
     232 + req = requests.get(notevil_url1.format(searchstr), proxies=proxies, headers=random_headers())
    33 233   soup = BeautifulSoup(req.text, 'html.parser')
    34  - result['urlTorch'] = []
    35  - pageNumber = ""
    36  - for i in soup.find("table",attrs={"width":"100%"}).findAll("small"):
    37  - if("of"in i.get_text()):
    38  - pageNumber = i.get_text()
    39  - pageNumber = round(float(clear(pageNumber.split("-")[1].split("of")[1]))/10)
    40  - if(pageNumber>99):
    41  - pageNumber=99
    42  - result['urlTorch'] = []
    43  - for n in tqdm(range(1,pageNumber+1),desc="Torch"):
    44  - urlTorch = "http://xmh57jrzrnw6insl.onion/4a1f6b371c/search.cgi?cmd=Search!&np={}&q={}".format(n,args.search)
    45  - #print(urlTorch)
    46  - try:
    47  - req = requests.get(urlTorchNumber+args.search,proxies=proxies)
     234 + 
     235 + page_number = 1
     236 + last_div = soup.find("div", attrs={"style": "text-align:center"}).find("div", attrs={"style": "text-align:center"})
     237 + if last_div is not None:
     238 + for i in last_div.find_all("a"):
     239 + page_number = int(i.get_text())
     240 + if page_number > max_nb_page:
     241 + page_number = max_nb_page
     242 + 
     243 + num_rows = 20
     244 + with tqdm(total=page_number, initial=0, desc="%20s" % "not Evil", unit="req", ascii=False, ncols=120,
     245 + bar_format=tqdm_bar_format) as progress_bar:
     246 + 
     247 + link_finder("notevil", soup)
     248 + progress_bar.update()
     249 + 
     250 + for n in range(2, page_number + 1):
     251 + start = (int(n - 1) * num_rows)
     252 + req = requests.get(notevil_url2.format(searchstr, start, num_rows),
     253 + proxies=proxies,
     254 + headers=random_headers())
    48 255   soup = BeautifulSoup(req.text, 'html.parser')
    49  - for i in soup.findAll('dl'):
    50  - result['urlTorch'].append({"name":clear(i.find('a').get_text()),"link":i.find('a')['href']})
    51  - except:
    52  - pass
     256 + link_finder("notevil", soup)
     257 + progress_bar.update()
     258 + time.sleep(1)
     259 + 
     260 + progress_bar.close()
     261 + 
     262 + 
     263 +def visitor(searchstr):
     264 + visitor_url = "http://visitorfi5kl7q7i.onion/search/?q={}&page={}"
     265 + max_nb_page = 30
     266 + if args.limit != 0:
     267 + max_nb_page = args.limit
     268 + 
     269 + bar_max = None
     270 + if args.barmode == "fixed":
     271 + bar_max = max_nb_page
     272 + with tqdm(total=bar_max, initial=0, desc="%20s" % "VisiTOR", unit="req", ascii=False, ncols=120,
     273 + bar_format=tqdm_bar_format) as progress_bar:
     274 + 
     275 + continue_processing = True
     276 + page_to_request = 1
     277 + 
     278 + with requests.Session() as s:
     279 + s.proxies = proxies
     280 + s.headers = random_headers()
     281 + 
     282 + while continue_processing:
     283 + resp = s.get(visitor_url.format(searchstr, page_to_request))
     284 + soup = BeautifulSoup(resp.text, 'html.parser')
     285 + link_finder("visitor", soup)
     286 + progress_bar.update()
     287 + 
     288 + next_page = soup.find('a', text="Next »")
     289 + if next_page is None or page_to_request >= max_nb_page:
     290 + continue_processing = False
     291 + 
     292 + page_to_request += 1
     293 + 
     294 + progress_bar.close()
     295 + 
     296 + 
     297 +def darksearchenginer(searchstr):
     298 + darksearchenginer_url = "http://7pwy57iklvt6lyhe.onion/"
     299 + max_nb_page = 20
     300 + if args.limit != 0:
     301 + max_nb_page = args.limit
     302 + page_number = 1
     303 + 
     304 + with requests.Session() as s:
     305 + s.proxies = proxies
     306 + s.headers = random_headers()
     307 + 
     308 + # Note that this search engine is very likely to timeout
     309 + resp = s.post(darksearchenginer_url, data={"search[keyword]": searchstr, "page": page_number})
     310 + soup = BeautifulSoup(resp.text, 'html.parser')
     311 + 
     312 + pages_input = soup.find_all("input", attrs={"name": "page"})
     313 + for i in pages_input:
     314 + page_number = int(i['value'])
     315 + if page_number > max_nb_page:
     316 + page_number = max_nb_page
     317 + 
     318 + with tqdm(total=page_number, initial=0, desc="%20s" % "Dark Search Enginer", unit="req", ascii=False, ncols=120,
     319 + bar_format=tqdm_bar_format) as progress_bar:
     320 + 
     321 + link_finder("darksearchenginer", soup)
     322 + progress_bar.update()
     323 + 
     324 + for n in range(2, page_number + 1):
     325 + resp = s.post(darksearchenginer_url, data={"search[keyword]": searchstr, "page": str(n)})
     326 + soup = BeautifulSoup(resp.text, 'html.parser')
     327 + link_finder("darksearchenginer", soup)
     328 + progress_bar.update()
     329 + 
     330 + progress_bar.close()
     331 + 
     332 + 
     333 +def phobos(searchstr):
     334 + phobos_url = "http://phobosxilamwcg75xt22id7aywkzol6q6rfl2flipcqoc4e4ahima5id.onion/search?query={}&p={}"
     335 + max_nb_page = 100
     336 + if args.limit != 0:
     337 + max_nb_page = args.limit
     338 + 
     339 + with requests.Session() as s:
     340 + s.proxies = proxies
     341 + s.headers = random_headers()
     342 + 
     343 + resp = s.get(phobos_url.format(searchstr, 1), proxies=proxies, headers=random_headers())
     344 + soup = BeautifulSoup(resp.text, 'html.parser')
     345 + 
     346 + page_number = 1
     347 + pages = soup.find("div", attrs={"class": "pages"}).find_all('a')
     348 + if pages is not None:
     349 + for i in pages:
     350 + page_number = int(i.get_text())
     351 + if page_number > max_nb_page:
     352 + page_number = max_nb_page
     353 + 
     354 + with tqdm(total=page_number, initial=0, desc="%20s" % "Phobos", unit="req", ascii=False, ncols=120,
     355 + bar_format=tqdm_bar_format) as progress_bar:
     356 + 
     357 + link_finder("phobos", soup)
     358 + progress_bar.update()
     359 + 
     360 + for n in range(2, page_number + 1):
     361 + resp = s.get(phobos_url.format(searchstr, n), proxies=proxies, headers=random_headers())
     362 + soup = BeautifulSoup(resp.text, 'html.parser')
     363 + link_finder("phobos", soup)
     364 + progress_bar.update()
     365 + 
     366 + progress_bar.close()
     367 + 
     368 + 
     369 +def onionsearchserver(searchstr):
     370 + onionsearchserver_url1 = "http://oss7wrm7xvoub77o.onion/oss/"
     371 + onionsearchserver_url2 = None
     372 + results_per_page = 10
     373 + max_nb_page = 100
     374 + if args.limit != 0:
     375 + max_nb_page = args.limit
     376 + 
     377 + with requests.Session() as s:
     378 + s.proxies = proxies
     379 + s.headers = random_headers()
     380 + 
     381 + resp = s.get(onionsearchserver_url1)
     382 + soup = BeautifulSoup(resp.text, 'html.parser')
     383 + for i in soup.find_all('iframe', attrs={"style": "display:none;"}):
     384 + onionsearchserver_url2 = i['src'] + "{}&page={}"
     385 + 
     386 + if onionsearchserver_url2 is None:
     387 + return -1
     388 + 
     389 + resp = s.get(onionsearchserver_url2.format(searchstr, 1))
     390 + soup = BeautifulSoup(resp.text, 'html.parser')
     391 + 
     392 + page_number = 1
     393 + pages = soup.find_all("div", attrs={"class": "osscmnrdr ossnumfound"})
     394 + if pages is not None and not str(pages[0].get_text()).startswith("No"):
     395 + total_results = float(str.split(clear(pages[0].get_text()))[0])
     396 + page_number = math.ceil(total_results / results_per_page)
     397 + if page_number > max_nb_page:
     398 + page_number = max_nb_page
     399 + 
     400 + with tqdm(total=page_number, initial=0, desc="%20s" % "Onion Search Server", unit="req", ascii=False, ncols=120,
     401 + bar_format=tqdm_bar_format) as progress_bar:
     402 + 
     403 + link_finder("onionsearchserver", soup)
     404 + progress_bar.update()
     405 + 
     406 + for n in range(2, page_number + 1):
     407 + resp = s.get(onionsearchserver_url2.format(searchstr, n))
     408 + soup = BeautifulSoup(resp.text, 'html.parser')
     409 + link_finder("onionsearchserver", soup)
     410 + progress_bar.update()
     411 + 
     412 + progress_bar.close()
     413 + 
     414 + 
     415 +def grams(searchstr):
      416 + # No multi-page handling, as it is very hard to get many results from this engine
     417 + grams_url1 = "http://grams7enqfy4nieo.onion/"
     418 + grams_url2 = "http://grams7enqfy4nieo.onion/results"
     419 + 
     420 + with requests.Session() as s:
     421 + s.proxies = proxies
     422 + s.headers = random_headers()
     423 + 
     424 + resp = s.get(grams_url1)
     425 + soup = BeautifulSoup(resp.text, 'html.parser')
     426 + token = soup.find('input', attrs={'name': '_token'})['value']
     427 + 
     428 + with tqdm(total=1, initial=0, desc="%20s" % "Grams", unit="req", ascii=False, ncols=120,
     429 + bar_format=tqdm_bar_format) as progress_bar:
     430 + 
     431 + resp = s.post(grams_url2, data={"req": searchstr, "_token": token})
     432 + soup = BeautifulSoup(resp.text, 'html.parser')
     433 + link_finder("grams", soup)
     434 + progress_bar.update()
     435 + progress_bar.close()
     436 + 
     437 + 
     438 +def candle(searchstr):
     439 + candle_url = "http://gjobjn7ievumcq6z.onion/?q={}"
     440 + 
     441 + with tqdm(total=1, initial=0, desc="%20s" % "Candle", unit="req", ascii=False, ncols=120,
     442 + bar_format=tqdm_bar_format) as progress_bar:
     443 + response = requests.get(candle_url.format(searchstr), proxies=proxies, headers=random_headers())
     444 + soup = BeautifulSoup(response.text, 'html.parser')
     445 + link_finder("candle", soup)
     446 + progress_bar.update()
     447 + progress_bar.close()
     448 + 
     449 + 
     450 +def torsearchengine(searchstr):
     451 + torsearchengine_url = "http://searchcoaupi3csb.onion/search/move/?q={}&pn={}&num=10&sdh=&"
     452 + max_nb_page = 100
     453 + if args.limit != 0:
     454 + max_nb_page = args.limit
     455 + 
     456 + with requests.Session() as s:
     457 + s.proxies = proxies
     458 + s.headers = random_headers()
     459 + 
     460 + resp = s.get(torsearchengine_url.format(searchstr, 1))
     461 + soup = BeautifulSoup(resp.text, 'html.parser')
     462 + 
     463 + page_number = 1
     464 + for i in soup.find_all('div', attrs={"id": "subheader"}):
     465 + if i.get_text() is not None and "of" in i.get_text():
     466 + total_results = int(i.find('p').find_all('b')[2].get_text().replace(",", ""))
     467 + results_per_page = 10
     468 + page_number = math.ceil(total_results / results_per_page)
     469 + if page_number > max_nb_page:
     470 + page_number = max_nb_page
     471 + 
     472 + with tqdm(total=page_number, initial=0, desc="%20s" % "Tor Search Engine", unit="req", ascii=False, ncols=120,
     473 + bar_format=tqdm_bar_format) as progress_bar:
     474 + 
     475 + link_finder("torsearchengine", soup)
     476 + progress_bar.update()
     477 + 
     478 + for n in range(2, page_number + 1):
     479 + resp = s.get(torsearchengine_url.format(searchstr, n))
     480 + soup = BeautifulSoup(resp.text, 'html.parser')
     481 + ret = link_finder("torsearchengine", soup)
     482 + progress_bar.update()
     483 + 
     484 + progress_bar.close()
     485 + 
     486 + 
     487 +def torgle(searchstr):
     488 + torgle_url = "http://torglejzid2cyoqt.onion/search.php?term={}"
     489 + 
     490 + with tqdm(total=1, initial=0, desc="%20s" % "Torgle", unit="req", ascii=False, ncols=120,
     491 + bar_format=tqdm_bar_format) as progress_bar:
     492 + response = requests.get(torgle_url.format(searchstr), proxies=proxies, headers=random_headers())
     493 + soup = BeautifulSoup(response.text, 'html.parser')
     494 + link_finder("torgle", soup)
     495 + progress_bar.update()
     496 + progress_bar.close()
    53 497   
    54  - darksearchnumber = "http://darksearch.io/api/search?query="
    55  - req = requests.get(darksearchnumber+args.search,proxies=proxies)
    56  - cookies = req.cookies
    57  - if(req.status_code==200):
    58  - result['darksearch']=[]
    59  - #print(req)
    60  - req = req.json()
    61  - if(req['last_page']>30):
    62  - pageNumber=30
    63  - else:
    64  - pageNumber=req['last_page']
    65  - #print(pageNumber)
    66  - for i in tqdm(range(1,pageNumber+1),desc="Darksearch io"):
    67  - #print(i)
    68  - darksearch = "http://darksearch.io/api/search?query={}&page=".format(args.search)
    69  - req = requests.get(darksearch+str(pageNumber),proxies=proxies,cookies=cookies)
    70  - if(req.status_code==200):
    71  - for r in req.json()['data']:
    72  - result['darksearch'].append({"name":r["title"],"link":r["link"]})
    73 498   
     499 +def onionsearchengine(searchstr):
     500 + onionsearchengine_url = "http://onionf4j3fwqpeo5.onion/search.php?search={}&submit=Search&page={}"
     501 + max_nb_page = 100
     502 + if args.limit != 0:
     503 + max_nb_page = args.limit
     504 + 
     505 + with requests.Session() as s:
     506 + s.proxies = proxies
     507 + s.headers = random_headers()
     508 + 
     509 + resp = s.get(onionsearchengine_url.format(searchstr, 1))
     510 + soup = BeautifulSoup(resp.text, 'html.parser')
     511 + 
     512 + page_number = 1
     513 + approx_re = re.search(r"\s([0-9]+)\sresult[s]?\sfound\s!.*", clear(soup.find('body').get_text()))
     514 + if approx_re is not None:
     515 + nb_res = int(approx_re.group(1))
     516 + results_per_page = 9
     517 + page_number = math.ceil(float(nb_res / results_per_page))
     518 + if page_number > max_nb_page:
     519 + page_number = max_nb_page
     520 + 
     521 + with tqdm(total=page_number, initial=0, desc="%20s" % "Onion Search Engine", unit="req", ascii=False, ncols=120,
     522 + bar_format=tqdm_bar_format) as progress_bar:
     523 + 
     524 + link_finder("onionsearchengine", soup)
     525 + progress_bar.update()
     526 + 
     527 + for n in range(2, page_number + 1):
     528 + resp = s.get(onionsearchengine_url.format(searchstr, n))
     529 + soup = BeautifulSoup(resp.text, 'html.parser')
     530 + link_finder("onionsearchengine", soup)
     531 + progress_bar.update()
     532 + 
     533 + progress_bar.close()
     534 + 
     535 + 
     536 +def tordex(searchstr):
     537 + tordex_url = "http://tordex7iie7z2wcg.onion/search?query={}&page={}"
     538 + max_nb_page = 100
     539 + if args.limit != 0:
     540 + max_nb_page = args.limit
     541 + 
     542 + with requests.Session() as s:
     543 + s.proxies = proxies
     544 + s.headers = random_headers()
     545 + 
     546 + resp = s.get(tordex_url.format(searchstr, 1))
     547 + soup = BeautifulSoup(resp.text, 'html.parser')
     548 + 
     549 + page_number = 1
     550 + pages = soup.find_all("li", attrs={"class": "page-item"})
     551 + if pages is not None:
     552 + for i in pages:
     553 + if i.get_text() != "...":
     554 + page_number = int(i.get_text())
     555 + if page_number > max_nb_page:
     556 + page_number = max_nb_page
     557 + 
     558 + with tqdm(total=page_number, initial=0, desc="%20s" % "Tordex", unit="req", ascii=False, ncols=120,
     559 + bar_format=tqdm_bar_format) as progress_bar:
     560 + 
     561 + link_finder("tordex", soup)
     562 + progress_bar.update()
     563 + 
     564 + for n in range(2, page_number + 1):
     565 + resp = s.get(tordex_url.format(searchstr, n))
     566 + soup = BeautifulSoup(resp.text, 'html.parser')
     567 + link_finder("tordex", soup)
     568 + progress_bar.update()
     569 + 
     570 + progress_bar.close()
     571 + 
     572 + 
     573 +def tor66(searchstr):
     574 + tor66_url = "http://tor66sezptuu2nta.onion/search?q={}&sorttype=rel&page={}"
     575 + max_nb_page = 30
     576 + if args.limit != 0:
     577 + max_nb_page = args.limit
     578 + 
     579 + with requests.Session() as s:
     580 + s.proxies = proxies
     581 + s.headers = random_headers()
     582 + 
     583 + resp = s.get(tor66_url.format(searchstr, 1))
     584 + soup = BeautifulSoup(resp.text, 'html.parser')
     585 + 
     586 + page_number = 1
     587 + approx_re = re.search(r"\.Onion\ssites\sfound\s:\s([0-9]+)",
     588 + resp.text)
     589 + if approx_re is not None:
     590 + nb_res = int(approx_re.group(1))
     591 + results_per_page = 20
     592 + page_number = math.ceil(float(nb_res / results_per_page))
     593 + if page_number > max_nb_page:
     594 + page_number = max_nb_page
     595 + 
     596 + with tqdm(total=page_number, initial=0, desc="%20s" % "Tor66", unit="req", ascii=False, ncols=120,
     597 + bar_format=tqdm_bar_format) as progress_bar:
     598 + 
     599 + link_finder("tor66", soup)
     600 + progress_bar.update()
     601 + 
     602 + for n in range(2, page_number + 1):
     603 + resp = s.get(tor66_url.format(searchstr, n))
     604 + soup = BeautifulSoup(resp.text, 'html.parser')
     605 + link_finder("tor66", soup)
     606 + progress_bar.update()
     607 + 
     608 + progress_bar.close()
     609 + 
     610 + 
     611 +def tormax(searchstr):
     612 + tormax_url = "http://tormaxunodsbvtgo.onion/tormax/search?q={}"
     613 + 
     614 + with tqdm(total=1, initial=0, desc="%20s" % "Tormax", unit="req", ascii=False, ncols=120,
     615 + bar_format=tqdm_bar_format) as progress_bar:
     616 + response = requests.get(tormax_url.format(searchstr), proxies=proxies, headers=random_headers())
     617 + soup = BeautifulSoup(response.text, 'html.parser')
     618 + link_finder("tormax", soup)
     619 + progress_bar.update()
     620 + progress_bar.close()
     621 + 
     622 + 
     623 +def haystack(searchstr):
     624 + haystack_url = "http://haystakvxad7wbk5.onion/?q={}&offset={}"
      625 + # At the 52nd page, it times out 100% of the time
     626 + max_nb_page = 50
     627 + if args.limit != 0:
     628 + max_nb_page = args.limit
     629 + offset_coeff = 20
     630 + 
     631 + with requests.Session() as s:
     632 + s.proxies = proxies
     633 + s.headers = random_headers()
     634 + 
     635 + req = s.get(haystack_url.format(searchstr, 0))
     636 + soup = BeautifulSoup(req.text, 'html.parser')
     637 + 
     638 + bar_max = None
     639 + if args.barmode == "fixed":
     640 + bar_max = max_nb_page
     641 + with tqdm(total=bar_max, initial=0, desc="%20s" % "Haystack", unit="req", ascii=False, ncols=120,
     642 + bar_format=tqdm_bar_format) as progress_bar:
     643 + 
     644 + continue_processing = True
     645 + 
     646 + ret = link_finder("haystack", soup)
     647 + if ret < 0:
     648 + continue_processing = False
     649 + progress_bar.update()
     650 + 
     651 + it = 1
     652 + while continue_processing:
     653 + offset = int(it * offset_coeff)
     654 + req = s.get(haystack_url.format(searchstr, offset))
     655 + soup = BeautifulSoup(req.text, 'html.parser')
     656 + ret = link_finder("haystack", soup)
     657 + progress_bar.update()
     658 + it += 1
     659 + if it >= max_nb_page or ret < 0:
     660 + continue_processing = False
     661 + 
     662 + 
     663 +def multivac(searchstr):
     664 + multivac_url = "http://multivacigqzqqon.onion/?q={}&page={}"
     665 + max_nb_page = 10
     666 + if args.limit != 0:
     667 + max_nb_page = args.limit
     668 + 
     669 + with requests.Session() as s:
     670 + s.proxies = proxies
     671 + s.headers = random_headers()
     672 + 
     673 + page_to_request = 1
     674 + req = s.get(multivac_url.format(searchstr, page_to_request))
     675 + soup = BeautifulSoup(req.text, 'html.parser')
     676 + 
     677 + bar_max = None
     678 + if args.barmode == "fixed":
     679 + bar_max = max_nb_page
     680 + with tqdm(total=bar_max, initial=0, desc="%20s" % "Multivac", unit="req", ascii=False, ncols=120,
     681 + bar_format=tqdm_bar_format) as progress_bar:
     682 + 
     683 + continue_processing = True
     684 + 
     685 + ret = link_finder("multivac", soup)
     686 + if ret < 0 or page_to_request >= max_nb_page:
     687 + continue_processing = False
     688 + progress_bar.update()
     689 + 
     690 + while continue_processing:
     691 + page_to_request += 1
     692 + req = s.get(multivac_url.format(searchstr, page_to_request))
     693 + soup = BeautifulSoup(req.text, 'html.parser')
     694 + ret = link_finder("multivac", soup)
     695 + progress_bar.update()
     696 + 
     697 + if page_to_request >= max_nb_page or ret < 0:
     698 + continue_processing = False
     699 + 
     700 + 
     701 +def evosearch(searchstr):
     702 + evosearch_url = "http://evo7no6twwwrm63c.onion/evo/search.php?" \
     703 + "query={}&" \
     704 + "start={}&" \
     705 + "search=1&type=and&mark=bold+text&" \
     706 + "results={}"
     707 + results_per_page = 50
     708 + max_nb_page = 30
     709 + if args.limit != 0:
     710 + max_nb_page = args.limit
     711 + 
     712 + with requests.Session() as s:
     713 + s.proxies = proxies
     714 + s.headers = random_headers()
     715 + 
     716 + req = s.get(evosearch_url.format(searchstr, 1, results_per_page))
     717 + soup = BeautifulSoup(req.text, 'html.parser')
     718 + 
     719 + page_number = 1
     720 + i = soup.find("p", attrs={"class": "cntr"})
     721 + if i is not None:
     722 + if i.get_text() is not None and "of" in i.get_text():
     723 + nb_res = float(clear(str.split(i.get_text().split("-")[1].split("of")[1])[0]))
      724 + # The results page loads in two stages; the second part is easily lost
     725 + page_number = math.ceil(nb_res / (results_per_page / 2))
     726 + if page_number > max_nb_page:
     727 + page_number = max_nb_page
     728 + 
     729 + with tqdm(total=page_number, initial=0, desc="%20s" % "Evo Search", unit="req", ascii=False, ncols=120,
     730 + bar_format=tqdm_bar_format) as progress_bar:
     731 + 
     732 + link_finder("evosearch", soup)
     733 + progress_bar.update()
     734 + 
     735 + for n in range(2, page_number + 1):
     736 + resp = s.get(evosearch_url.format(searchstr, n, results_per_page))
     737 + soup = BeautifulSoup(resp.text, 'html.parser')
     738 + link_finder("evosearch", soup)
     739 + progress_bar.update()
     740 + 
     741 + progress_bar.close()
     742 + 
     743 + 
     744 +def oneirun(searchstr):
     745 + oneirun_url = "http://oneirunda366dmfm.onion/Home/IndexEn"
     746 + 
     747 + with requests.Session() as s:
     748 + s.proxies = proxies
     749 + s.headers = random_headers()
     750 + 
     751 + resp = s.get(oneirun_url)
     752 + soup = BeautifulSoup(resp.text, 'html.parser')
     753 + token = soup.find('input', attrs={"name": "__RequestVerificationToken"})['value']
     754 + 
     755 + with tqdm(total=1, initial=0, desc="%20s" % "Oneirun", unit="req", ascii=False, ncols=120,
     756 + bar_format=tqdm_bar_format) as progress_bar:
     757 + response = s.post(oneirun_url.format(searchstr),
     758 + data={"searchString": searchstr, "__RequestVerificationToken": token})
     759 + soup = BeautifulSoup(response.text, 'html.parser')
     760 + link_finder("oneirun", soup)
     761 + progress_bar.update()
     762 + progress_bar.close()
     763 + 
     764 + 
     765 +def deeplink(searchstr):
     766 + deeplink_url1 = "http://deeplinkdeatbml7.onion/index.php"
     767 + deeplink_url2 = "http://deeplinkdeatbml7.onion/?search={}&type=verified"
     768 + 
     769 + with requests.Session() as s:
     770 + s.proxies = proxies
     771 + s.headers = random_headers()
     772 + resp = s.get(deeplink_url1)
     773 + 
     774 + with tqdm(total=1, initial=0, desc="%20s" % "DeepLink", unit="req", ascii=False, ncols=120,
     775 + bar_format=tqdm_bar_format) as progress_bar:
     776 + response = s.get(deeplink_url2.format(searchstr))
     777 + soup = BeautifulSoup(response.text, 'html.parser')
     778 + link_finder("deeplink", soup)
     779 + progress_bar.update()
     780 + progress_bar.close()
     781 + 
     782 + 
     783 +def link_finder(engine_str, data_obj):
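 # Parses the page (or, for darksearchio, the JSON data) returned by the given engine and appends
 # every (name, link) pair to the global result dict. For engines that support it, returns -1 when
 # no (more) results are found so callers can stop paging; otherwise returns 1.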
     784 + global result
     785 + name = ""
     786 + link = ""
     787 + 
     788 + def append_link():
     789 + result[engine_str].append({"name": name, "link": link})
     790 + 
     791 + if engine_str not in result:
     792 + result[engine_str] = []
     793 + 
     794 + if engine_str == "ahmia":
     795 + for i in data_obj.find_all('li', attrs={'class': 'result'}):
     796 + i = i.find('h4')
     797 + name = clear(i.get_text())
     798 + link = i.find('a')['href'].replace("/search/search/redirect?search_term={}&redirect_url="
     799 + .format(args.search), "")
     800 + append_link()
     801 + 
     802 + if engine_str == "candle":
     803 + for i in data_obj.find('html').find_all('a'):
     804 + if str(i['href']).startswith("http"):
     805 + name = clear(i.get_text())
     806 + link = clear(i['href'])
     807 + append_link()
     808 + 
     809 + if engine_str == "darksearchenginer":
     810 + for i in data_obj.find('div', attrs={"class": "table-responsive"}).find_all('a'):
     811 + name = clear(i.get_text())
     812 + link = clear(i['href'])
     813 + append_link()
     814 + 
     815 + if engine_str == "darksearchio":
     816 + for r in data_obj:
     817 + name = clear(r["title"])
     818 + link = clear(r["link"])
     819 + append_link()
     820 + 
     821 + if engine_str == "deeplink":
     822 + for tr in data_obj.find_all('tr'):
     823 + cels = tr.find_all('td')
     824 + if cels is not None and len(cels) == 4:
     825 + name = clear(cels[1].get_text())
     826 + link = clear(cels[0].find('a')['href'])
     827 + append_link()
     828 + 
     829 + if engine_str == "evosearch":
     830 + if data_obj.find('div', attrs={"id": "results"}) is not None:
     831 + count = 0
     832 + for div in data_obj.find('div', attrs={"id": "results"}).find_all('div', attrs={"class": "odrow"}):
     833 + name = clear(div.find('div', attrs={"class": "title"}).find('a').get_text())
     834 + link = clear(div.find('div', attrs={"class": "title"}).find('a')['href']
     835 + .replace("./include/click_counter.php?url=", "")
     836 + .replace("&query={}".format(args.search), ""))
     837 + count += 1
     838 + append_link()
     839 + 
     840 + if engine_str == "grams":
     841 + for i in data_obj.find_all("div", attrs={"class": "media-body"}):
     842 + if not i.find('span'):
     843 + for j in i.find_all('a'):
     844 + if str(j.get_text()).startswith("http"):
     845 + link = j.get_text()
     846 + else:
     847 + name = j.get_text()
     848 + append_link()
     849 + 
     850 + if engine_str == "haystack":
     851 + if data_obj.find('div', attrs={"class": "result"}) is None:
     852 + return -1
     853 + for div in data_obj.find_all('div', attrs={"class": "result"}):
     854 + if div.find('a') is not None and div.find('i') is not None:
     855 + name = clear(div.find('a').get_text())
     856 + link = clear(div.find('i').get_text())
     857 + append_link()
     858 + 
     859 + if engine_str == "multivac":
     860 + for i in data_obj.find_all('dl'):
     861 + link_tag = i.find('a')
     862 + if link_tag:
     863 + if link_tag['href'] != "":
     864 + name = clear(link_tag.get_text())
     865 + link = clear(link_tag['href'])
     866 + append_link()
     867 + else:
     868 + return -1
     869 + 
     870 + if engine_str == "notevil":
      871 + ''' As with OnionLand, we could use the span instead of the href to get a cleaner link.
      872 + However, some useful links are only shown under the "li" tag,
      873 + where a sanitized version is not available.
      874 + Thus, the best option would be to implement a generic sanitize function. '''
     875 + for i in data_obj.find_all('p'):
     876 + name = clear(i.find('a').get_text())
     877 + link = i.find('a')['href'].replace("./r2d.php?url=", "")
     878 + append_link()
     879 + for i in data_obj.find_all('li'):
     880 + name = clear(i.find('a').get_text())
     881 + link = i.find('a')['href'].replace("./r2d.php?url=", "")
     882 + append_link()
     883 + 
     884 + if engine_str == "oneirun":
     885 + for td in data_obj.find_all('td', attrs={"style": "vertical-align: top;"}):
     886 + name = clear(td.find('h5').get_text())
     887 + link = clear(td.find('a')['href'])
     888 + append_link()
     889 + 
     890 + if engine_str == "onionland":
     891 + if data_obj.find('div', attrs={"class": "row no-result-row"}):
     892 + return -1
     893 + for i in data_obj.find_all('div', attrs={"class": "result-block"}):
     894 + if not str(clear(i.find('div', attrs={'class': "title"}).find('a')['href'])).startswith("/ads"):
     895 + name = clear(i.find('div', attrs={'class': "title"}).get_text())
     896 + link = clear(i.find('div', attrs={'class': "link"}).get_text())
     897 + append_link()
     898 + 
     899 + if engine_str == "onionsearchengine":
     900 + for i in data_obj.find_all('table'):
     901 + for j in i.find_all('a'):
     902 + if str(j['href']).startswith("url.php?u=") and not str(j.get_text()).startswith("http://"):
     903 + name = clear(j.get_text())
     904 + link = clear(str(j['href']).replace("url.php?u=", ""))
     905 + append_link()
     906 + 
     907 + if engine_str == "onionsearchserver":
     908 + for i in data_obj.find_all('div', attrs={"class": "osscmnrdr ossfieldrdr1"}):
     909 + name = clear(i.find('a').get_text())
     910 + link = clear(i.find('a')['href'])
     911 + append_link()
     912 + 
     913 + if engine_str == "phobos":
     914 + links = data_obj.find('div', attrs={"class": "serp"}).find_all('a', attrs={"class": "titles"})
     915 + for i in links:
     916 + name = clear(i.get_text())
     917 + link = clear(i['href'])
     918 + append_link()
     919 + 
     920 + if engine_str == "tor66":
     921 + for i in data_obj.find('hr').find_all_next('b'):
     922 + if i.find('a'):
     923 + name = clear(i.find('a').get_text())
     924 + link = clear(i.find('a')['href'])
     925 + append_link()
     926 + 
     927 + if engine_str == "torch":
     928 + for i in data_obj.find_all('dl'):
     929 + name = clear(i.find('a').get_text())
     930 + link = i.find('a')['href']
     931 + append_link()
     932 + 
     933 + if engine_str == "tordex":
     934 + for i in data_obj.find_all('div', attrs={"class": "result mb-3"}):
     935 + a_link = i.find('h5').find('a')
     936 + name = clear(a_link.get_text())
     937 + link = clear(a_link['href'])
     938 + append_link()
     939 + 
     940 + if engine_str == "torgle":
     941 + for i in data_obj.find_all('ul', attrs={"id": "page"}):
     942 + for j in i.find_all('a'):
     943 + if str(j.get_text()).startswith("http"):
     944 + link = clear(j.get_text())
     945 + else:
     946 + name = clear(j.get_text())
     947 + append_link()
     948 + 
     949 + if engine_str == "tormax":
     950 + for i in data_obj.find_all('article'):
     951 + if i.find('a') is not None and i.find('div') is not None:
     952 + link = clear(i.find('div', attrs={"class": "url"}).get_text())
     953 + name = clear(i.find('a', attrs={"class": "title"}).get_text())
     954 + append_link()
     955 + 
     956 + if engine_str == "torsearchengine":
     957 + for i in data_obj.find_all('h3', attrs={'class': 'title text-truncate'}):
     958 + name = clear(i.find('a').get_text())
     959 + link = i.find('a')['data-uri']
     960 + append_link()
     961 + 
     962 + if engine_str == "visitor":
     963 + li_tags = data_obj.find_all('li', attrs={'class': 'hs_site'})
     964 + for i in li_tags:
     965 + h3tags = i.find_all('h3')
     966 + for n in h3tags:
     967 + name = clear(n.find('a').get_text())
     968 + link = n.find('a')['href']
     969 + append_link()
     970 + 
     971 + return 1
     972 + 
     973 + 
     974 +def call_func_as_str(function_name, function_arg):
     975 + try:
     976 + globals()[function_name](function_arg)
     977 + except ConnectionError:
     978 + print("Error: unable to connect")
     979 + except OSError:
     980 + print("Error: unable to connect")
     981 + 
     982 + 
     983 +def scrape():
     984 + global result
     985 + 
     986 + start_time = datetime.now()
     987 + 
     988 + if args.engines and len(args.engines) > 0:
     989 + engines = args.engines[0]
     990 + for e in engines:
     991 + try:
     992 + if not (args.exclude and len(args.exclude) > 0 and e in args.exclude[0]):
     993 + call_func_as_str(e, args.search)
     994 + except KeyError:
     995 + print("Error: search engine {} not in the list of supported engines".format(e))
    74 996   else:
    75  - print("Rate limit darksearch.io !")
     997 + for e in supported_engines:
     998 + if not (args.exclude and len(args.exclude) > 0 and e in args.exclude[0]):
     999 + call_func_as_str(e, args.search)
    76 1000   
    77  - result['onionland'] = []
    78  - for n in tqdm(range(1,400),desc="OnionLand"):
    79  - onionland = "http://3bbaaaccczcbdddz.onion/search?q={}&page={}".format(args.search,n)
    80  - #print(urlTorch)
    81  - req = requests.get(onionland,proxies=proxies)
    82  - if(req.status_code==200):
    83  - soup = BeautifulSoup(req.text, 'html.parser')
    84  - for i in soup.findAll('div',attrs={"class":"result-block"}):
    85  - if('''<span class="label-ad">Ad</span>''' not in i):
    86  - #print({"name":i.find('div',attrs={'class':"title"}).get_text(),"link":clear(i.find('div',attrs={'class':"link"}).get_text())})
    87  - result['onionland'].append({"name":i.find('div',attrs={'class':"title"}).get_text(),"link":clear(i.find('div',attrs={'class':"link"}).get_text())})
    88  - else:
    89  - break
     1001 + stop_time = datetime.now()
     1002 + 
     1003 + total = 0
     1004 + print("\nReport:")
      1005 + print(" Execution time: %.1f seconds" % (stop_time - start_time).total_seconds())
    90 1006   
    91  - print("Ahmia : " + str(len(result['ahmia'])))
    92  - print("Torch : "+str(len(result['urlTorch'])))
    93  - print("Darksearch io : "+str(len(result['darksearch'])))
    94  - print("Onionland : "+str(len(result['onionland'])))
    95  - print("Total of {} links !\nExported to {}".format(str(len(result['ahmia'])+len(result['urlTorch'])+len(result['darksearch'])+len(result['onionland'])),args.output))
    96  - f= open(args.output,"w+")
    97  - for i in result['urlTorch']:
    98  - f.write("name : {} link: {}\n".format(clearn(i["name"]),i["link"]))
    99  - for i in result['onionland']:
    100  - f.write("name: {} link : {}\n".format(clearn(i["name"]),i["link"]))
    101  - for i in result['ahmia']:
    102  - f.write("name : {} link : {}\n".format(clearn(i["name"]),i["link"]))
    103  - for i in result['darksearch']:
    104  - f.write("name : {} link : {}\n".format(clearn(i["name"]),i["link"]))
     1007 + f = open(args.output, "w+")
     1008 + for engine in result.keys():
     1009 + print(" {}: {}".format(engine, str(len(result[engine]))))
     1010 + total += len(result[engine])
     1011 + for i in result[engine]:
     1012 + f.write("\"{}\",\"{}\",\"{}\"\n".format(engine, i["name"], i["link"]))
    105 1013   
    106 1014   f.close()
    107  -scrape()
     1015 + print(" Total: {} links written to {}".format(str(total), args.output))
     1016 + 
     1017 + 
     1018 +if __name__ == "__main__":
     1019 + scrape()
    108 1020   