| skipped 5 lines |
6 | 6 | | from datetime import datetime |
7 | 7 | | from functools import reduce |
8 | 8 | | from random import choice |
9 | | - | |
10 | 9 | | from multiprocessing import Pool, cpu_count, current_process, freeze_support |
11 | 10 | | from tqdm import tqdm |
12 | 11 | | |
| skipped 5 lines |
18 | 17 | | from bs4 import BeautifulSoup |
19 | 18 | | from urllib3.exceptions import ProtocolError |
20 | 19 | | |
21 | | - | import engines |
| 20 | + | ENGINES = { |
| 21 | + | "ahmia": "http://msydqstlz2kzerdg.onion", |
| 22 | + | "darksearchio": "http://darksearch.io", |
| 23 | + | "onionland": "http://3bbad7fauom4d6sgppalyqddsqbf5u5p56b5k5uk2zxsy3d6ey2jobad.onion", |
| 24 | + | "notevil": "http://hss3uro2hsxfogfq.onion", |
| 25 | + | "darksearchenginer": "http://7pwy57iklvt6lyhe.onion", |
| 26 | + | "phobos": "http://phobosxilamwcg75xt22id7aywkzol6q6rfl2flipcqoc4e4ahima5id.onion", |
| 27 | + | "onionsearchserver": "http://oss7wrm7xvoub77o.onion", |
| 28 | + | "torgle": "http://submarhglcl66nz6.onion",
| 29 | + | "torgle1": "http://torgle5fj664v7pf.onion", |
| 30 | + | "onionsearchengine": "http://onionf4j3fwqpeo5.onion", |
| 31 | + | "tordex": "http://tordex7iie7z2wcg.onion", |
| 32 | + | "tor66": "http://tor66sezptuu2nta.onion", |
| 33 | + | "tormax": "http://tormaxunodsbvtgo.onion", |
| 34 | + | "haystack": "http://haystakvxad7wbk5.onion", |
| 35 | + | "multivac": "http://multivacigqzqqon.onion", |
| 36 | + | "evosearch": "http://evo7no6twwwrm63c.onion", |
| 37 | + | "deeplink": "http://deeplinkdeatbml7.onion", |
| 38 | + | } |
22 | 39 | | |
23 | 40 | | desktop_agents = [ |
24 | 41 | | 'Mozilla/5.0 (Windows NT 6.1; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/54.0.2840.99 Safari/537.36', |
| skipped 12 lines |
37 | 54 | | 'Mozilla/5.0 (Windows NT 10.0; WOW64; rv:50.0) Gecko/20100101 Firefox/50.0' |
38 | 55 | | ] |
39 | 56 | | |
40 | | - | supported_engines = engines.ENGINES |
| 57 | + | supported_engines = ENGINES |
41 | 58 | | |
42 | 59 | | available_csv_fields = [ |
43 | 60 | | "engine", |
| skipped 78 lines |
122 | 139 | | return results |
123 | 140 | | |
124 | 141 | | |
125 | | - | def torch(searchstr): |
126 | | - | results = [] |
127 | | - | torch_url = supported_engines['torch'] + "/4a1f6b371c/search.cgi?cmd=Search!&np={}&q={}" |
128 | | - | results_per_page = 10 |
129 | | - | max_nb_page = 100 |
130 | | - | if args.limit != 0: |
131 | | - | max_nb_page = args.limit |
132 | | - | |
133 | | - | with requests.Session() as s: |
134 | | - | s.proxies = proxies |
135 | | - | s.headers = random_headers() |
136 | | - | |
137 | | - | req = s.get(torch_url.format(0, quote(searchstr))) |
138 | | - | soup = BeautifulSoup(req.text, 'html5lib') |
139 | | - | |
140 | | - | page_number = 1 |
141 | | - | for i in soup.find("table", attrs={"width": "100%"}).find_all("small"): |
142 | | - | if i.get_text() is not None and "of" in i.get_text(): |
143 | | - | page_number = math.ceil(float(clear(i.get_text().split("-")[1].split("of")[1])) / results_per_page) |
144 | | - | if page_number > max_nb_page: |
145 | | - | page_number = max_nb_page |
146 | | - | |
147 | | - | pos = get_proc_pos() |
148 | | - | with tqdm(total=page_number, initial=0, desc=get_tqdm_desc("TORCH", pos), position=pos) as progress_bar: |
149 | | - | |
150 | | - | results = link_finder("torch", soup) |
151 | | - | progress_bar.update() |
152 | | - | |
153 | | - | # Usually range is 2 to n+1, but TORCH behaves differently |
154 | | - | for n in range(1, page_number): |
155 | | - | req = s.get(torch_url.format(n, quote(searchstr))) |
156 | | - | soup = BeautifulSoup(req.text, 'html5lib') |
157 | | - | results = results + link_finder("torch", soup) |
158 | | - | progress_bar.update() |
159 | | - | |
160 | | - | return results |
161 | | - | |
162 | | - | |
163 | | - | def torch1(searchstr): |
164 | | - | results = [] |
165 | | - | torch1_url = supported_engines['torch1'] + "/search?q={}&cmd=Search!" |
166 | | - | |
167 | | - | pos = get_proc_pos() |
168 | | - | with tqdm(total=1, initial=0, desc=get_tqdm_desc("TORCH 1", pos), position=pos) as progress_bar: |
169 | | - | response = requests.get(torch1_url.format(quote(searchstr)), proxies=proxies, headers=random_headers()) |
170 | | - | soup = BeautifulSoup(response.text, 'html5lib') |
171 | | - | results = link_finder("torch1", soup) |
172 | | - | progress_bar.update() |
173 | | - | |
174 | | - | return results |
175 | 142 | | |
176 | 143 | | |
177 | 144 | | def darksearchio(searchstr): |
| skipped 120 lines |
298 | 265 | | return results |
299 | 266 | | |
300 | 267 | | |
301 | | - | def visitor(searchstr): |
302 | | - | results = [] |
303 | | - | visitor_url = supported_engines['visitor'] + "/search/?q={}&page={}" |
304 | | - | max_nb_page = 30 |
305 | | - | if args.limit != 0: |
306 | | - | max_nb_page = args.limit |
307 | | - | |
308 | | - | pos = get_proc_pos() |
309 | | - | with tqdm(total=max_nb_page, initial=0, desc=get_tqdm_desc("VisiTOR", pos), position=pos) as progress_bar: |
310 | | - | continue_processing = True |
311 | | - | page_to_request = 1 |
312 | | - | |
313 | | - | with requests.Session() as s: |
314 | | - | s.proxies = proxies |
315 | | - | s.headers = random_headers() |
316 | | - | |
317 | | - | while continue_processing: |
318 | | - | resp = s.get(visitor_url.format(quote(searchstr), page_to_request)) |
319 | | - | soup = BeautifulSoup(resp.text, 'html5lib') |
320 | | - | results = results + link_finder("visitor", soup) |
321 | | - | progress_bar.update() |
322 | | - | |
323 | | - | next_page = soup.find('a', text="Next »") |
324 | | - | if next_page is None or page_to_request >= max_nb_page: |
325 | | - | continue_processing = False |
326 | | - | |
327 | | - | page_to_request += 1 |
328 | | - | |
329 | | - | return results |
330 | 268 | | |
331 | 269 | | |
332 | 270 | | def darksearchenginer(searchstr): |
| skipped 118 lines |
451 | 389 | | return results |
452 | 390 | | |
453 | 391 | | |
454 | | - | def grams(searchstr): |
455 | | - | results = [] |
456 | | - | # No multi pages handling as it is very hard to get many results on this engine |
457 | | - | grams_url1 = supported_engines['grams'] |
458 | | - | grams_url2 = supported_engines['grams'] + "/results" |
459 | | - | |
460 | | - | with requests.Session() as s: |
461 | | - | s.proxies = proxies |
462 | | - | s.headers = random_headers() |
463 | | - | |
464 | | - | resp = s.get(grams_url1) |
465 | | - | soup = BeautifulSoup(resp.text, 'html5lib') |
466 | | - | token = soup.find('input', attrs={'name': '_token'})['value'] |
467 | | - | |
468 | | - | pos = get_proc_pos() |
469 | | - | with tqdm(total=1, initial=0, desc=get_tqdm_desc("Grams", pos), position=pos) as progress_bar: |
470 | | - | resp = s.post(grams_url2, data={"req": searchstr, "_token": token}) |
471 | | - | soup = BeautifulSoup(resp.text, 'html5lib') |
472 | | - | results = link_finder("grams", soup) |
473 | | - | progress_bar.update() |
474 | | - | |
475 | | - | return results |
476 | | - | |
477 | | - | |
478 | | - | def candle(searchstr): |
479 | | - | results = [] |
480 | | - | candle_url = supported_engines['candle'] + "/?q={}" |
481 | | - | |
482 | | - | pos = get_proc_pos() |
483 | | - | with tqdm(total=1, initial=0, desc=get_tqdm_desc("Candle", pos), position=pos) as progress_bar: |
484 | | - | response = requests.get(candle_url.format(quote(searchstr)), proxies=proxies, headers=random_headers()) |
485 | | - | soup = BeautifulSoup(response.text, 'html5lib') |
486 | | - | results = link_finder("candle", soup) |
487 | | - | progress_bar.update() |
488 | | - | |
489 | | - | return results |
490 | | - | |
491 | | - | |
492 | | - | def torsearchengine(searchstr): |
493 | | - | results = [] |
494 | | - | torsearchengine_url = supported_engines['torsearchengine'] + "/search/move/?q={}&pn={}&num=10&sdh=&" |
495 | | - | max_nb_page = 100 |
496 | | - | if args.limit != 0: |
497 | | - | max_nb_page = args.limit |
498 | | - | |
499 | | - | with requests.Session() as s: |
500 | | - | s.proxies = proxies |
501 | | - | s.headers = random_headers() |
502 | | - | |
503 | | - | resp = s.get(torsearchengine_url.format(quote(searchstr), 1)) |
504 | | - | soup = BeautifulSoup(resp.text, 'html5lib') |
505 | | - | |
506 | | - | page_number = 1 |
507 | | - | for i in soup.find_all('div', attrs={"id": "subheader"}): |
508 | | - | if i.get_text() is not None and "of" in i.get_text(): |
509 | | - | total_results = int(i.find('p').find_all('b')[2].get_text().replace(",", "")) |
510 | | - | results_per_page = 10 |
511 | | - | page_number = math.ceil(total_results / results_per_page) |
512 | | - | if page_number > max_nb_page: |
513 | | - | page_number = max_nb_page |
514 | | - | |
515 | | - | pos = get_proc_pos() |
516 | | - | with tqdm(total=page_number, initial=0, desc=get_tqdm_desc("Tor Search Engine", pos), position=pos) \ |
517 | | - | as progress_bar: |
518 | | - | |
519 | | - | results = link_finder("torsearchengine", soup) |
520 | | - | progress_bar.update() |
521 | | - | |
522 | | - | for n in range(2, page_number + 1): |
523 | | - | resp = s.get(torsearchengine_url.format(quote(searchstr), n)) |
524 | | - | soup = BeautifulSoup(resp.text, 'html5lib') |
525 | | - | results = results + link_finder("torsearchengine", soup) |
526 | | - | progress_bar.update() |
527 | | - | |
528 | | - | return results |
529 | | - | |
530 | 392 | | |
531 | 393 | | def torgle(searchstr): |
532 | 394 | | results = [] |
| skipped 127 lines |
660 | 522 | | |
661 | 523 | | def tormax(searchstr): |
662 | 524 | | results = [] |
663 | | - | tormax_url = supported_engines['tormax'] + "/tormax/search?q={}" |
| 525 | + | tormax_url = supported_engines['tormax'] + "/search?q={}" |
664 | 526 | | |
665 | 527 | | pos = get_proc_pos() |
666 | 528 | | with tqdm(total=1, initial=0, desc=get_tqdm_desc("Tormax", pos), position=pos) as progress_bar: |
| skipped 124 lines |
791 | 653 | | return results |
792 | 654 | | |
793 | 655 | | |
794 | | - | def oneirun(searchstr): |
795 | | - | results = [] |
796 | | - | oneirun_url = supported_engines['oneirun'] + "/Home/IndexEn" |
797 | | - | |
798 | | - | with requests.Session() as s: |
799 | | - | s.proxies = proxies |
800 | | - | s.headers = random_headers() |
801 | | - | |
802 | | - | resp = s.get(oneirun_url) |
803 | | - | soup = BeautifulSoup(resp.text, 'html5lib') |
804 | | - | token = soup.find('input', attrs={"name": "__RequestVerificationToken"})['value'] |
805 | | - | |
806 | | - | pos = get_proc_pos() |
807 | | - | with tqdm(total=1, initial=0, desc=get_tqdm_desc("Oneirun", pos), position=pos) as progress_bar: |
808 | | - | response = s.post(oneirun_url.format(quote(searchstr)), data={ |
809 | | - | "searchString": searchstr, |
810 | | - | "__RequestVerificationToken": token |
811 | | - | }) |
812 | | - | soup = BeautifulSoup(response.text, 'html5lib') |
813 | | - | results = link_finder("oneirun", soup) |
814 | | - | progress_bar.update() |
815 | | - | |
816 | | - | return results |
817 | | - | |
818 | 656 | | |
819 | 657 | | def deeplink(searchstr): |
820 | 658 | | results = [] |
| skipped 15 lines |
836 | 674 | | return results |
837 | 675 | | |
838 | 676 | | |
839 | | - | def torsearchengine1(searchstr): |
840 | | - | results = [] |
841 | | - | torsearchengine1_url1 = supported_engines['torsearchengine1'] |
842 | | - | torsearchengine1_url2 = supported_engines['torsearchengine1'] + "/index.php" |
843 | | - | |
844 | | - | with requests.Session() as s: |
845 | | - | s.proxies = proxies |
846 | | - | s.headers = random_headers() |
847 | | - | s.get(torsearchengine1_url1) |
848 | | - | |
849 | | - | pos = get_proc_pos() |
850 | | - | with tqdm(total=1, initial=0, desc=get_tqdm_desc("TOR Search Engine 1", pos), position=pos) as progress_bar: |
851 | | - | response = s.post(torsearchengine1_url2, {'search': searchstr, 'search2': ''}) |
852 | | - | soup = BeautifulSoup(response.text, 'html5lib') |
853 | | - | results = link_finder("torsearchengine1", soup) |
854 | | - | progress_bar.update() |
855 | | - | |
856 | | - | return results |
857 | | - | |
858 | 677 | | |
859 | 678 | | def torgle1(searchstr): |
860 | 679 | | results = [] |
| skipped 35 lines |
896 | 715 | | return results |
897 | 716 | | |
898 | 717 | | |
899 | | - | def grams1(searchstr): |
900 | | - | results = [] |
901 | | - | grams1_url = supported_engines['grams1'] + "/results/index.php?page={}&searchstr={}" |
902 | | - | results_per_page = 25 |
903 | | - | max_nb_page = 30 |
904 | | - | if args.limit != 0: |
905 | | - | max_nb_page = args.limit |
906 | | - | |
907 | | - | with requests.Session() as s: |
908 | | - | s.proxies = proxies |
909 | | - | s.headers = random_headers() |
910 | | - | |
911 | | - | resp = s.get(grams1_url.format(1, quote(searchstr))) |
912 | | - | soup = BeautifulSoup(resp.text, 'html5lib') |
913 | | - | |
914 | | - | page_number = 1 |
915 | | - | pages = soup.find_all('div', attrs={"class": "result-text"}) |
916 | | - | if pages is not None: |
917 | | - | res_re = re.match(r"About ([0-9]+) result(.*)", clear(pages[0].get_text())) |
918 | | - | total_results = int(res_re.group(1)) |
919 | | - | page_number = math.ceil(total_results / results_per_page) |
920 | | - | if page_number > max_nb_page: |
921 | | - | page_number = max_nb_page |
922 | | - | |
923 | | - | pos = get_proc_pos() |
924 | | - | with tqdm(total=page_number, initial=0, desc=get_tqdm_desc("Grams 1", pos), position=pos) as progress_bar: |
925 | | - | results = link_finder("grams1", soup) |
926 | | - | progress_bar.update() |
927 | | - | |
928 | | - | for n in range(2, page_number + 1): |
929 | | - | resp = s.get(grams1_url.format(n, quote(searchstr))) |
930 | | - | soup = BeautifulSoup(resp.text, 'html5lib') |
931 | | - | results = results + link_finder("grams1", soup) |
932 | | - | progress_bar.update() |
933 | | - | |
934 | | - | return results |
935 | 718 | | |
936 | 719 | | |
937 | 720 | | def get_domain_from_url(link): |
| skipped 47 lines |
985 | 768 | | link = r.find('a')['href'].split('redirect_url=')[1] |
986 | 769 | | add_link() |
987 | 770 | | |
988 | | - | if engine_str == "candle": |
989 | | - | for r in data_obj.select("body h2 a"): |
990 | | - | if str(r['href']).startswith("http"): |
991 | | - | name = clear(r.get_text()) |
992 | | - | link = clear(r['href']) |
993 | | - | add_link() |
994 | | - | |
995 | 771 | | if engine_str == "darksearchenginer": |
996 | 772 | | for r in data_obj.select('.table-responsive a'): |
997 | 773 | | name = clear(r.get_text()) |
| skipped 20 lines |
1018 | 794 | | link = get_parameter(r['href'], 'url') |
1019 | 795 | | add_link() |
1020 | 796 | | |
1021 | | - | if engine_str == "grams": |
1022 | | - | for i in data_obj.find_all("div", attrs={"class": "media-body"}): |
1023 | | - | if not i.find('span'): |
1024 | | - | for r in i.select(".searchlinks a"): |
1025 | | - | name = clear(r.get_text()) |
1026 | | - | link = clear(r['href']) |
1027 | | - | add_link() |
1028 | 797 | | |
1029 | | - | if engine_str == "grams1": |
1030 | | - | for r in data_obj.select(".searchlinks a"): |
1031 | | - | name = clear(r.get_text()) |
1032 | | - | link = clear(r['href']) |
1033 | | - | add_link() |
1034 | 798 | | |
1035 | 799 | | if engine_str == "haystack": |
1036 | 800 | | for r in data_obj.select(".result b a"): |
| skipped 11 lines |
1048 | 812 | | break |
1049 | 813 | | |
1050 | 814 | | if engine_str == "notevil": |
1051 | | - | for r in data_obj.select('#content > div > p > a:not([target])'): |
| 815 | + | for r in data_obj.find_all("p"): |
| 816 | + | r = r.find("a")
1052 | 817 | | name = clear(r.get_text()) |
1053 | | - | link = get_parameter(r['href'], 'url') |
| 818 | + | link = unquote(r["href"]).split('./r2d.php?url=')[1].split('&')[0] |
1054 | 819 | | add_link() |
1055 | 820 | | |
1056 | | - | if engine_str == "oneirun": |
1057 | | - | for td in data_obj.find_all('td', attrs={"style": "vertical-align: top;"}): |
1058 | | - | name = clear(td.find('h5').get_text()) |
1059 | | - | link = clear(td.find('a')['href']) |
1060 | | - | add_link() |
1061 | 821 | | |
1062 | 822 | | if engine_str == "onionland": |
1063 | 823 | | for r in data_obj.select('.result-block .title a'): |
| skipped 27 lines |
1091 | 851 | | link = clear(i.find('a')['href']) |
1092 | 852 | | add_link() |
1093 | 853 | | |
1094 | | - | if engine_str == "torch": |
1095 | | - | for r in data_obj.select("dl > dt > a"): |
1096 | | - | name = clear(r.get_text()) |
1097 | | - | link = clear(r['href']) |
1098 | | - | add_link() |
1099 | | - | |
1100 | | - | if engine_str == "torch1": |
1101 | | - | for r in data_obj.select("dl > dt > a"): |
1102 | | - | name = clear(r.get_text()) |
1103 | | - | link = clear(r['href']) |
1104 | | - | add_link() |
1105 | 854 | | |
1106 | 855 | | if engine_str == "tordex": |
1107 | 856 | | for r in data_obj.select('.container h5 a'): |
| skipped 17 lines |
1125 | 874 | | add_link() |
1126 | 875 | | |
1127 | 876 | | if engine_str == "tormax": |
1128 | | - | for r in data_obj.select("#search-results article a.title"): |
1129 | | - | name = clear(r.get_text()) |
1130 | | - | link = clear(r.find_next_sibling('div', {'class': 'url'}).get_text()) |
| 877 | + | for r in data_obj.find_all("section",attrs={"id":"search-results"})[0].find_all("article"): |
| 878 | + | name = clear(r.find('a',attrs={"class":"title"}).get_text()) |
| 879 | + | link = clear(r.find('div',attrs={"class":"url"}).get_text()) |
1131 | 880 | | add_link() |
1132 | 881 | | |
1133 | | - | if engine_str == "torsearchengine": |
1134 | | - | for i in data_obj.find_all('h3', attrs={'class': 'title text-truncate'}): |
1135 | | - | name = clear(i.find('a').get_text()) |
1136 | | - | link = i.find('a')['data-uri'] |
1137 | | - | add_link() |
1138 | 882 | | |
1139 | | - | if engine_str == "torsearchengine1": |
1140 | | - | for r in data_obj.find_all('span', {'style': 'font-size:1.2em;font-weight:bold;color:#1a0dab'}): |
1141 | | - | name = clear(r.get_text()) |
1142 | | - | link = r.find_next_sibling('a')['href'] |
1143 | | - | add_link() |
1144 | | - | |
1145 | | - | if engine_str == "visitor": |
1146 | | - | for r in data_obj.select(".hs_site h3 a"): |
1147 | | - | name = clear(r.get_text()) |
1148 | | - | link = clear(r['href']) |
1149 | | - | add_link() |
1150 | 883 | | |
1151 | 884 | | if args.continuous_write and not csv_file.closed: |
1152 | 885 | | csv_file.close() |
| skipped 7 lines |
1160 | 893 | | ret = [] |
1161 | 894 | | try: |
1162 | 895 | | ret = globals()[method_name](argument) |
1163 | | - | except ConnectionError: |
1164 | | - | print("Error: unable to connect") |
1165 | | - | except OSError: |
1166 | | - | print("Error: unable to connect") |
1167 | | - | except ProtocolError: |
| 896 | + | except Exception:
1168 | 897 | | print("Error: unable to connect") |
1169 | 898 | | return ret |
1170 | 899 | | |
| skipped 59 lines |
1230 | 959 | | total += n |
1231 | 960 | | print(" Total: {} links written to {}".format(str(total), filename)) |
1232 | 961 | | |
1233 | | - | |
1234 | | - | if __name__ == "__main__": |
1235 | | - | scrape() |
1236 | | - | |