# Regex-WebScrapper
A CLI for retrieving information from web pages, written for Python 3.7.<br/><br/>
python3 regexscraper -u "https://example.com" -r "(t|r)est"<br/>
This outputs all matches of the regex pattern given with -r on the site given with -u. Pages are loaded through Selenium, so JavaScript-rendered content is included.
<br/>
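As a rough illustration of the flow described above (render the page with Selenium, then run the regex over the page source), here is a minimal sketch. It is not the project's actual code: the function names `render_page`, `find_matches` and `main` are hypothetical, the choice of Firefox is an assumption, and the use of `re.findall` is inferred from the tuple note under "Regex patterns".

```python
import argparse
import re


def render_page(url):
    """Load url in a Selenium-driven browser and return the rendered HTML."""
    # Imported here so the regex helper below works without Selenium installed.
    from selenium import webdriver

    # Assumption: Firefox with geckodriver available on PATH.
    driver = webdriver.Firefox()
    try:
        driver.get(url)
        return driver.page_source
    finally:
        # Always close the browser so it does not keep holding memory.
        driver.quit()


def find_matches(html, pattern):
    """Return the re.findall results for pattern over the page source."""
    return re.findall(pattern, html)


def main(argv=None):
    # Only the -u and -r flags come from the README; the help text is illustrative.
    parser = argparse.ArgumentParser(prog="regexscraper")
    parser.add_argument("-u", required=True, help="URL of the page to scrape")
    parser.add_argument("-r", required=True, help="regex pattern to search for")
    args = parser.parse_args(argv)
    for match in find_matches(render_page(args.u), args.r):
        print(match)
```

Under these assumptions, `main(["-u", "https://example.com", "-r", "(t|r)est"])` would print one line per match. Note that `re.findall` returns only the captured group when a pattern contains one, so `(t|r)est` yields `t` or `r`, while the non-capturing form `(?:t|r)est` yields the whole word.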
## Important:
Selenium runs a full browser, so it uses far more memory than a plain HTTP request would. Take that into account when using this tool.
## Regex patterns:
IP Addresses Tuple: `((([0-1]{0,1}[0-9]{0,2}|25[0-5]|2[0-4][0-9])\.){3}([0-1]{0,1}[0-9]{0,2}|25[0-5]|2[0-4][0-9]))` This returns a tuple with the full address at the 0th index.<br/>
IP Address Sort of: `\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}` Returns the IP address as a string, but the result is not necessarily a valid address, since each group accepts any one to three digits (for example 999).
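The difference between the two return shapes can be seen with Python's `re.findall`, which yields a tuple per match when the pattern has several groups and a plain string when it has none. The sketch below copies the two patterns from this section; the names `TUPLE_IP`, `LOOSE_IP` and `full_addresses`, and the sample text, are made up for illustration.

```python
import re

# Patterns copied from the "Regex patterns" section.
TUPLE_IP = re.compile(
    r"((([0-1]{0,1}[0-9]{0,2}|25[0-5]|2[0-4][0-9])\.){3}"
    r"([0-1]{0,1}[0-9]{0,2}|25[0-5]|2[0-4][0-9]))"
)
LOOSE_IP = re.compile(r"\d{1,3}\.\d{1,3}\.\d{1,3}\.\d{1,3}")


def full_addresses(page_text):
    """Keep only index 0 of each tuple, which holds the full address."""
    return [groups[0] for groups in TUPLE_IP.findall(page_text)]


# Hypothetical page text, for illustration only.
tuple_matches = full_addresses("gateway 192.168.1.10")
loose_matches = LOOSE_IP.findall("gateway 192.168.1.10, bogus 999.1.1.1")

print(tuple_matches)  # ['192.168.1.10']
print(loose_matches)  # ['192.168.1.10', '999.1.1.1']
```

The second result shows why the loose pattern is only "sort of" an IP matcher: `999.1.1.1` is returned even though it is not a valid address.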
## Things:
https://stackoverflow.com/questions/39547598/selenium-common-exceptions-webdriverexception-message-connection-refused<br/>
https://stackoverflow.com/questions/8220108/how-do-i-check-the-operating-system-in-python<br/>
https://selenium-python.readthedocs.io/api.html<br/>
https://stackoverflow.com/questions/1285917/how-to-disable-javascript-when-using-selenium/51681608#51681608<br/>