SickGear/sickbeard/bs4_parser.py
JackDandy 43778d8edd Change providers, add some, remove one, fix a snatch issue, rework ignore/require words, refactor code.
Add BitMeTV torrent provider.
Add TVChaosUK torrent provider.
Add HD-Space torrent provider.
Add Shazbat torrent provider.
Remove Animenzb provider.
Change use tvdbid for searching usenet providers.
Change consolidate global and per show ignore and require words functions.
Change "Require word" title and notes on Config Search page to properly describe its functional logic.
Add "Reject Blu-ray M2TS releases" to BTN provider.
Add regular expression capability to ignore and require words by starting wordlist with "regex:".
Add list shows with custom ignore and require words under the global counterparts on the Search Settings page.
Fix failure to search for more than one selected wanted episode.
2015-12-02 01:31:50 +00:00


from bs4 import BeautifulSoup
import re


class BS4Parser:
    def __init__(self, *args, **kwargs):
        # list type param of "feature" arg is not currently correctly tested by bs4 (r353)
        # so for now, adjust param to provide possible values until the issue is addressed
        kwargs_new = {}
        for k, v in kwargs.items():
            if 'features' in k and isinstance(v, list):
                v = [item for item in v if item in ['html5lib', 'html.parser', 'html', 'lxml', 'xml']][0]

            kwargs_new[k] = v

        tag, attr = [x in kwargs_new and kwargs_new.pop(x) or y for (x, y) in [('tag', 'table'), ('attr', '')]]
        if attr:
            args = (re.sub(r'(?is).*(<%(tag)s[^>]+%(attr)s[^>]*>.*</%(tag)s>).*' % {'tag': tag, 'attr': attr},
                           r'<html><head></head><body>\1</body></html>', args[0]).strip(),) + args[1:]

        self.soup = BeautifulSoup(*args, **kwargs_new)

    def __enter__(self):
        return self.soup

    def __exit__(self, exc_ty, exc_val, tb):
        self.soup.clear(True)
        self.soup = None
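
The two preprocessing steps in `__init__` can be tried in isolation, without bs4 installed. The sketch below is illustrative only: the helper names `pick_feature` and `isolate_element`, the `KNOWN_FEATURES` constant, and the sample markup are assumptions made for this example and are not part of SickGear. The regular expression and the list of parser names are copied from the class above.

```python
import re

# Parser names accepted by the filter in BS4Parser.__init__ (copied from the class).
KNOWN_FEATURES = ['html5lib', 'html.parser', 'html', 'lxml', 'xml']


def pick_feature(features):
    """Hypothetical helper: reduce a list of features to the first known parser name.

    Like the original code, this raises IndexError if no list item is a known name.
    """
    if isinstance(features, list):
        return [item for item in features if item in KNOWN_FEATURES][0]
    return features


def isolate_element(markup, tag='table', attr=''):
    """Hypothetical helper: keep only the matching element, wrapped in a bare HTML shell.

    With an empty attr, or when nothing matches, the markup is returned unchanged
    (apart from stripping when attr is given).
    """
    if not attr:
        return markup
    pattern = r'(?is).*(<%(tag)s[^>]+%(attr)s[^>]*>.*</%(tag)s>).*' % {'tag': tag, 'attr': attr}
    return re.sub(pattern, r'<html><head></head><body>\1</body></html>', markup).strip()


# Made-up sample page, for illustration only.
page = ('<html><body><div>nav</div>'
        '<table id="torrents"><tr><td>x</td></tr></table>'
        '<p>footer</p></body></html>')

trimmed = isolate_element(page, attr='id="torrents"')
chosen = pick_feature(['permissive', 'html5lib'])
```

In the class itself both steps run before `BeautifulSoup` is constructed, so a call along the lines of `with BS4Parser(html, features=['html5lib', 'permissive'], attr='id="torrents"') as soup:` would, under the same assumptions, hand bs4 only the trimmed markup and a single parser name. Because the pattern is greedy, a page containing several matching elements would likely need a more specific `attr` value.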