自然语言处理之词袋模型Bag_of_words

自然语言处理之词袋模型Bag_of_words文章目录读取训练数据BeautifulSoup处理获取词袋和向量预测结果使用随机森林分类器进行分类输出提交结果尝试使用xgb还是随机森林好用教程地址:https://www.kaggle.com/c/word2vec-nlp-tutorial/overview/part-1-for-beginners-bag-of-words读取训练数据训练数据的内容是2500条电影评论。impor…

大家好,又见面了,我是你们的朋友全栈君。

教程地址:


https://www.kaggle.com/c/word2vec-nlp-tutorial/overview/part-1-for-beginners-bag-of-words

读取训练数据

训练数据的内容是2500条电影评论。

import pandas as pd
train = pd.read_csv("./data/labeledTrainData.tsv", header=0, delimiter="\t", quoting=3)
train.head(3)
id sentiment review
0 “5814_8” 1 “With all this stuff going down at the moment …
1 “2381_9” 1 “\”The Classic War of the Worlds\” by Timothy …
2 “7759_3” 0 “The film starts with a manager (Nicholas Bell…
train.shape
(25000, 3)
example = train['review'][0]
example
'"With all this stuff going down at the moment with MJ i\'ve started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ\'s feeling towards the press and also the obvious message of drugs are bad m\'kay.<br /><br />Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.<br /><br />The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci\'s character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ\'s music.<br /><br />Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.<br /><br />Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ\'s bestest buddy in this movie is a girl! Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i\'ve gave this subject....hmmm well i don\'t know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."'

train当中的review项里面包含的数据是HTML类型,为了去除HTML标签,保存纯粹的评论,使用BeautifulSoup。

BeautifulSoup处理

from bs4 import BeautifulSoup
# 创建 beautifulsoup 对象
soup = BeautifulSoup(example)
#格式化输出内容
print(soup.prettify())
<html>
 <body>
  <p>
   "With all this stuff going down at the moment with MJ i've started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ's feeling towards the press and also the obvious message of drugs are bad m'kay.
   <br/>
   <br/>
   Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.
   <br/>
   <br/>
   The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci's character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ's music.
   <br/>
   <br/>
   Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.
   <br/>
   <br/>
   Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ's bestest buddy in this movie is a girl! Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i've gave this subject....hmmm well i don't know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."
  </p>
 </body>
</html>
# 查找各个标签
print(soup.title)
print(soup.head)
print(soup.a)
print(soup.p)
None
None
None
<p>"With all this stuff going down at the moment with MJ i've started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ's feeling towards the press and also the obvious message of drugs are bad m'kay.<br/><br/>Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.<br/><br/>The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci's character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ's music.<br/><br/>Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.<br/><br/>Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ's bestest buddy in this movie is a girl! Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i've gave this subject....hmmm well i don't know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."</p>
# 遍历孩子
for child in  soup.body.children:
    print (child)
<p>"With all this stuff going down at the moment with MJ i've started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ's feeling towards the press and also the obvious message of drugs are bad m'kay.<br/><br/>Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.<br/><br/>The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci's character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ's music.<br/><br/>Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.<br/><br/>Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ's bestest buddy in this movie is a girl! Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i've gave this subject....hmmm well i don't know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."</p>
# find_all是一个很神奇的函数,可以传入字符、列表、正则表达式、函数等等等。
print(soup.find_all('br'))
[<br/>, <br/>, <br/>, <br/>, <br/>, <br/>, <br/>, <br/>]
# 可以在soup.select里面直接使用css代码
print(soup.select('.p'))
[]
# 获取text
print(soup.get_text())
"With all this stuff going down at the moment with MJ i've started listening to his music, watching the odd documentary here and there, watched The Wiz and watched Moonwalker again. Maybe i just want to get a certain insight into this guy who i thought was really cool in the eighties just to maybe make up my mind whether he is guilty or innocent. Moonwalker is part biography, part feature film which i remember going to see at the cinema when it was originally released. Some of it has subtle messages about MJ's feeling towards the press and also the obvious message of drugs are bad m'kay.Visually impressive but of course this is all about Michael Jackson so unless you remotely like MJ in anyway then you are going to hate this and find it boring. Some may call MJ an egotist for consenting to the making of this movie BUT MJ and most of his fans would say that he made it for the fans which if true is really nice of him.The actual feature film bit when it finally starts is only on for 20 minutes or so excluding the Smooth Criminal sequence and Joe Pesci is convincing as a psychopathic all powerful drug lord. Why he wants MJ dead so bad is beyond me. Because MJ overheard his plans? Nah, Joe Pesci's character ranted that he wanted people to know it is he who is supplying drugs etc so i dunno, maybe he just hates MJ's music.Lots of cool things in this like MJ turning into a car and a robot and the whole Speed Demon sequence. Also, the director must have had the patience of a saint when it came to filming the kiddy Bad sequence as usually directors hate working with one kid let alone a whole bunch of them performing a complex dance scene.Bottom line, this movie is for people who like MJ on one level or another (which i think is most people). If not, then stay away. It does try and give off a wholesome message and ironically MJ's bestest buddy in this movie is a girl! Michael Jackson is truly one of the most talented people ever to grace this planet but is he guilty? Well, with all the attention i've gave this subject....hmmm well i don't know because people can be different behind closed doors, i know this for a fact. He is either an extremely nice but stupid guy or one of the most sickest liars. I hope he is not the latter."
import nltk
import re
from nltk.corpus import stopwords
from nltk.stem.lancaster import LancasterStemmer
lancaster_stemmer = LancasterStemmer()
print (stopwords.words("english"))
['i', 'me', 'my', 'myself', 'we', 'our', 'ours', 'ourselves', 'you', "you're", "you've", "you'll", "you'd", 'your', 'yours', 'yourself', 'yourselves', 'he', 'him', 'his', 'himself', 'she', "she's", 'her', 'hers', 'herself', 'it', "it's", 'its', 'itself', 'they', 'them', 'their', 'theirs', 'themselves', 'what', 'which', 'who', 'whom', 'this', 'that', "that'll", 'these', 'those', 'am', 'is', 'are', 'was', 'were', 'be', 'been', 'being', 'have', 'has', 'had', 'having', 'do', 'does', 'did', 'doing', 'a', 'an', 'the', 'and', 'but', 'if', 'or', 'because', 'as', 'until', 'while', 'of', 'at', 'by', 'for', 'with', 'about', 'against', 'between', 'into', 'through', 'during', 'before', 'after', 'above', 'below', 'to', 'from', 'up', 'down', 'in', 'out', 'on', 'off', 'over', 'under', 'again', 'further', 'then', 'once', 'here', 'there', 'when', 'where', 'why', 'how', 'all', 'any', 'both', 'each', 'few', 'more', 'most', 'other', 'some', 'such', 'no', 'nor', 'not', 'only', 'own', 'same', 'so', 'than', 'too', 'very', 's', 't', 'can', 'will', 'just', 'don', "don't", 'should', "should've", 'now', 'd', 'll', 'm', 'o', 're', 've', 'y', 'ain', 'aren', "aren't", 'couldn', "couldn't", 'didn', "didn't", 'doesn', "doesn't", 'hadn', "hadn't", 'hasn', "hasn't", 'haven', "haven't", 'isn', "isn't", 'ma', 'mightn', "mightn't", 'mustn', "mustn't", 'needn', "needn't", 'shan', "shan't", 'shouldn', "shouldn't", 'wasn', "wasn't", 'weren', "weren't", 'won', "won't", 'wouldn', "wouldn't"]

获取词袋和向量

# 写一个处理函数
def review_to_words( raw_review ):
    review_text = BeautifulSoup(raw_review).get_text()
    # 去除标点和数字,仅保留强烈语气词
    letters_only = re.sub("[^a-zA-Z?!]", " ", review_text) 
    # 统一转换为小写字母
    words = letters_only.lower().split()                             
    # 由于set的搜索速度更快,所以把list转换成set
    stops = set(stopwords.words("english"))                  
    # 移除停止词,并且将词转为原形形式
    meaningful_words = [lancaster_stemmer.stem(w) for w in words if not w in stops]   
    # 返回标准语句
    return( " ".join( meaningful_words ))  
# 获取字符串列表
num_reviews = train["review"].size
clean_train_reviews = []
for i in range(num_reviews):
    clean_train_reviews.append( review_to_words( train["review"][i] ) )

uk edit show rath less extrav us vert person concern get new kitch perhap bedroom bathroom wond grat got us vert show everyth real tv instead mak improv hous occup could afford entir hous get rebuilt know show try show lousy welf system ex us beg hard enough receiv rath vulg produc plac tak plac particul sear also uncal rsther turn on famy depr are pot millionair would far bet help commun whol instead spend hundr thousand doll on hom build someth whol commun perhap plac diy pow tool borrow return along build mat everyon benefit want giv on person caus enorm res among rest loc commun stil liv run hous

在进行下一步之前,有必要介绍一下词袋模型,要将几个句子转化成向量,第一步是把它们包含的所有词不重复地装到一个袋子里,然后这几个句子就可以转换成和袋子里的词的数量一样长的向量,这个向量的每一个位置都对应着袋子里面某一个词在句子中出现的次数,如果没有出现就是0.

from sklearn.feature_extraction.text import CountVectorizer
vectorizer = CountVectorizer(analyzer = "word", tokenizer = None, preprocessor = None, stop_words = None, max_features = 5000)
train_data_features = vectorizer.fit_transform(clean_train_reviews)
train_data_features = train_data_features.toarray()
print(train_data_features.shape)
(25000, 5000)

查看词袋里面装的具体内容

vocab = vectorizer.get_feature_names()
print(len(vocab))
print(vocab)
5000
['abandon', 'abbot', 'abc', 'abduc', 'abl', 'abomin', 'aborigin', 'abort', 'abound', 'about', 'abraham', 'abrupt', 'abs', 'absolv', 'absorb', 'absurd', 'abud', 'abund', 'abus', 'abysm', 'ac', 'academy', 'acc', 'acceiv', 'access', 'accid', 'acclaim', 'accompany', 'accompl', 'accord', 'account', 'accus', 'ach', 'achiev', 'acid', 'acknowledg', 'acquaint', 'acquir', 'across', 'act', 'actress', 'ad', 'adam', 'adapt', 'addict', 'addit', 'address', 'adel', 'adequ', 'adjust', 'admin', 'admir', 'admit', 'adolesc', 'adopt', 'adr', 'adult', 'adv', 'advers', 'advert', 'aesthet', 'af', 'affair', 'affect', 'affirm', 'affleck', 'afford', 'afr', 'afraid', 'afric', 'afterma', 'afternoon', 'afterward', 'ag', 'again', 'agend', 'aggress', 'ago', 'agon', 'agony', 'agr', 'agree', 'ah', 'ahead', 'aid', 'aim', 'aimless', 'air', 'airpl', 'airport', 'ak', 'akin', 'akshay', 'al', 'ala', 'alarm', 'alb', 'albeit', 'albert', 'alcohol', 'alec', 'alert', 'alex', 'alexand', 'alexandr', 'alfr', 'alic', 'alik', 'alison', 'all', 'alleg', 'alley', 'allow', 'allud', 'almost', 'alon', 'along', 'alongsid', 'alot', 'already', 'alright', 'also', 'alt', 'altern', 'although', 'altm', 'altogeth', 'alvin', 'alway', 'aly', 'am', 'amand', 'amaz', 'amazon', 'amb', 'ambigu', 'ambit', 'amby', 'americ', 'amidst', 'amitabh', 'among', 'amongst', 'amount', 'ampl', 'amrit', 'amus', 'amy', 'an', 'analys', 'anch', 'anct', 'and', 'anderson', 'andr', 'andre', 'andrew', 'andy', 'ang', 'angel', 'angl', 'angry', 'angst', 'angy', 'anil', 'anim', 'ann', 'annount', 'annoy', 'anny', 'anoth', 'answ', 'ant', 'antagon', 'antholog', 'anthony', 'anticip', 'anton', 'antonio', 'antonion', 'antwon', 'anxy', 'anybody', 'anyhow', 'anym', 'anyon', 'anyone', 'anyth', 'anytim', 'anyway', 'anywh', 'ap', 'apart', 'apocalypt', 'apolog', 'app', 'appal', 'appear', 'appl', 'applaud', 'apply', 'apprecy', 'approach', 'appropry', 'approv', 'approxim', 'april', 'apt', 'ar', 'arab', 'arc', 'arch', 'archaeolog', 'architect', 'are', 'area', 'argentin', 'argu', 'ariel', 'aristocr', 'arkin', 'arm', 'armstrong', 'army', 'arnold', 'around', 'arquet', 'arrang', 'arrest', 'arrog', 'arrow', 'art', 'arth', 'artic', 'artsy', 'artwork', 'arty', 'as', 'ash', 'asham', 'ashley', 'asid', 'ask', 'asleep', 'aspect', 'aspir', 'ass', 'assassin', 'assault', 'assembl', 'assert', 'asset', 'assign', 'assist', 'assocy', 'assort', 'assum', 'astair', 'aston', 'astound', 'astronaut', 'asyl', 'at', 'athlet', 'atl', 'atmosph', 'atroc', 'atrocy', 'attach', 'attack', 'attempt', 'attenborough', 'attend', 'attitud', 'attorney', 'attract', 'attribut', 'aud', 'audio', 'audit', 'audrey', 'audy', 'august', 'aunt', 'aur', 'aussy', 'aust', 'austin', 'austral', 'aut', 'auth', 'auto', 'autobiograph', 'autom', 'av', 'avail', 'aveng', 'avid', 'avoid', 'aw', 'await', 'awak', 'award', 'away', 'awesom', 'awhil', 'awkward', 'ax', 'aztec', 'bab', 'baby', 'babysit', 'bacal', 'bach', 'bachch', 'bachel', 'back', 'backdrop', 'background', 'backst', 'backward', 'bacon', 'bad', 'baddy', 'baffl', 'bag', 'bait', 'bak', 'baksh', 'bal', 'bald', 'baldwin', 'ballet', 'ban', 'band', 'bang', 'bank', 'bant', 'bar', 'barb', 'barbar', 'barbr', 'bargain', 'bark', 'barn', 'barney', 'baron', 'barrel', 'barry', 'barrym', 'bas', 'basebal', 'bash', 'basket', 'basketbal', 'bastard', 'bat', 'bath', 'bathroom', 'batm', 'battl', 'battlefield', 'bau', 'bay', 'bbc', 'be', 'beach', 'bean', 'bear', 'beard', 'beast', 'beat', 'beatl', 'beatty', 'beauty', 'beav', 'becam', 'beckham', 'beckins', 'becom', 'bed', 'bedroom', 'beer', 'beetl', 'befriend', 'beg', 'begin', 'begun', 'behav', 'behavio', 'behavy', 'behind', 'behold', 'being', 'bel', 'belg', 'believ', 'belong', 'belov', 'belt', 'belush', 'ben', 'bend', 'benea', 'benefit', 'bennet', 'bent', 'beowulf', 'bergm', 'berkeley', 'berlin', 'bernard', 'besid', 'best', 'bet', 'betray', 'better', 'betty', 'bev', 'bew', 'bewild', 'beyond', 'bias', 'bibl', 'big', 'biggest', 'bik', 'bikin', 'biko', 'bil', 'bimbo', 'bin', 'bind', 'bing', 'biograph', 'biop', 'bir', 'bird', 'birthday', 'bit', 'bitch', 'bittersweet', 'bizar', 'bla', 'black', 'blackmail', 'blad', 'blah', 'blair', 'blak', 'blam', 'bland', 'blank', 'blast', 'blat', 'blaz', 'bleak', 'blee', 'blend', 'bless', 'blew', 'blind', 'blink', 'bliss', 'blob', 'block', 'blockbust', 'blond', 'blood', 'bloody', 'bloom', 'blossom', 'blow', 'blown', 'blu', 'blunt', 'blur', 'bo', 'board', 'boast', 'boat', 'bob', 'bobby', 'body', 'bog', 'bogart', 'boggl', 'boil', 'bol', 'bold', 'bollywood', 'bomb', 'bon', 'bond', 'bonny', 'boo', 'boob', 'boog', 'book', 'boom', 'boost', 'boot', 'bor', 'bord', 'boredom', 'born', 'borrow', 'boss', 'boston', 'both', 'bottl', 'bottom', 'bought', 'bound', 'bount', 'bounty', 'bourn', 'bout', 'bow', 'bowl', 'box', 'boy', 'boyfriend', 'boyl', 'brad', 'brady', 'brain', 'brainless', 'branagh', 'branch', 'brand', 'brando', 'brat', 'brav', 'braveheart', 'bravo', 'brazil', 'brea', 'bread', 'break', 'breakdown', 'breakfast', 'breast', 'breath', 'breathtak', 'bree', 'brend', 'brent', 'bret', 'bri', 'brick', 'brid', 'bridg', 'bridget', 'brief', 'bright', 'bril', 'bring', 'brit', 'britain', 'bro', 'broad', 'broadcast', 'broadway', 'brok', 'bronson', 'bront', 'brood', 'brook', 'brooklyn', 'brosn', 'broth', 'brought', 'brow', 'brown', 'bruc', 'bruno', 'brush', 'brut', 'bry', 'bsg', 'btw', 'bubbl', 'buck', 'bucket', 'bud', 'buddy', 'budget', 'buff', 'buffalo', 'bug', 'build', 'built', 'bul', 'bulk', 'bullet', 'bum', 'bumbl', 'bump', 'bunch', 'bunny', 'bur', 'burd', 'burk', 'burn', 'burst', 'burt', 'burton', 'bury', 'bus', 'busey', 'bush', 'businessm', 'bust', 'busy', 'but', 'butch', 'butl', 'button', 'buy', 'buzz', 'bye', 'cab', 'cabin', 'cabl', 'caf', 'cag', 'cagney', 'cain', 'cak', 'cal', 'calc', 'calib', 'californ', 'calm', 'cam', 'cambod', 'camcord', 'cameo', 'camer', 'camera', 'cameram', 'cameron', 'camp', 'campaign', 'campbel', 'campy', 'can', 'canad', 'cancel', 'candid', 'candl', 'candy', 'cannib', 'cannon', 'cannot', 'cant', 'canyon', 'cap', 'capac', 'capit', 'capot', 'capt', 'captain', 'car', 'card', 'cardboard', 'carel', 'cares', 'caretak', 'carey', 'carl', 'carlito', 'carlo', 'carm', 'carn', 'carol', 'carolin', 'caron', 'carp', 'carradin', 'carrey', 'carry', 'cart', 'cartoon', 'cary', 'cas', 'casablanc', 'cash', 'casino', 'casp', 'cassavet', 'cassidy', 'cast', 'castl', 'cat', 'catalog', 'catastroph', 'catch', 'catchy', 'categ', 'catherin', 'cathol', 'cattl', 'caught', 'caus', 'caut', 'cav', 'cbs', 'cd', 'ceas', 'cecil', 'ceil', 'cel', 'celebr', 'celest', 'celluloid', 'cemetery', 'cens', 'cent', 'century', 'cerebr', 'ceremony', 'certain', 'cg', 'cgi', 'chain', 'chainsaw', 'chair', 'challeng', 'chamb', 'chamberlain', 'champ', 'chan', 'chang', 'channel', 'chant', 'chao', 'chaplin', 'chapt', 'char', 'charact', 'charg', 'charism', 'charl', 'charlot', 'charlton', 'charm', 'chas', 'chat', 'chavez', 'che', 'cheadl', 'cheap', 'cheaply', 'check', 'cheek', 'chees', 'cheesy', 'chem', 'cher', 'chess', 'chest', 'chew', 'chib', 'chicago', 'chick', 'chief', 'chil', 'child', 'childr', 'chin', 'chines', 'chip', 'cho', 'chocol', 'choir', 'chok', 'chong', 'choos', 'chop', 'choppy', 'chor', 'choreograph', 'chos', 'chris', 'christ', 'christian', 'christianity', 'christians', 'christie', 'christina', 'christine', 'christmas', 'christopher', 'christy', 'chronicles', 'chuck', 'chuckl', 'church', 'churn', 'cia', 'cigaret', 'cinderell', 'cindy', 'cinem', 'cinema', 'cinematograph', 'circ', 'circumst', 'cit', 'city', 'civil', 'cla', 'clad', 'claim', 'clair', 'clan', 'clar', 'clark', 'clash', 'class', 'classm', 'classy', 'claud', 'claustrophob', 'claw', 'clay', 'cle', 'clear', 'clerk', 'clev', 'cli', 'clich', 'click', 'cliff', 'cliffhang', 'clim', 'climact', 'climax', 'climb', 'clin', 'clint', 'clip', 'cliv', 'cloak', 'clock', 'clon', 'clooney', 'clos', 'closest', 'closet', 'closeup', 'cloth', 'cloud', 'clown', 'clu', 'club', 'clueless', 'clumsy', 'clunky', 'clut', 'co', 'coach', 'coast', 'coat', 'cod', 'cody', 'coff', 'coffin', 'coh', 'coher', 'coincid', 'cok', 'col', 'cold', 'colin', 'coll', 'collab', 'collaps', 'colleagu', 'collect', 'colleg', 'collet', 'collin', 'colm', 'colo', 'colon', 'colonel', 'colony', 'columb', 'columbo', 'com', 'comb', 'combin', 'comeback', 'comedy', 'comfort', 'command', 'commend', 'commerc', 'commit', 'common', 'commun', 'comp', 'company', 'comparison', 'compass', 'compel', 'compens', 'compet', 'competit', 'compl', 'complain', 'complaint', 'complet', 'complex', 'comply', 'compos', 'composit', 'compound', 'compr', 'comprehend', 'comprom', 'compuls', 'comput', 'con', 'conceit', 'conceiv', 'concern', 'concert', 'conclud', 'concoct', 'cond', 'condemn', 'condit', 'conduc', 'conf', 'confess', 'confid', 'confin', 'confirm', 'conflict', 'confront', 'confus', 'congrat', 'connect', 'connery', 'conqu', 'conrad', 'conscy', 'consequ', 'conserv', 'consid', 'consist', 'conspir', 'const', 'constitut', 'construct', 'consum', 'cont', 'contact', 'contain', 'contemp', 'contempl', 'contempt', 'contend', 'contest', 'context', 'contin', 'continu', 'contract', 'contradict', 'contrast', 'contribut', 'control', 'controvers', 'controversy', 'conv', 'conveny', 'convers', 'convert', 'convey', 'convict', 'convint', 'convolv', 'cook', 'cooky', 'cool', 'coop', 'cop', 'cor', 'corbet', 'corey', 'corm', 'corn', 'corny', 'corp', 'corps', 'correct', 'corrid', 'corrupt', 'cost', 'costum', 'couch', 'could', 'counsel', 'count', 'counterpart', 'countless', 'country', 'countrysid', 'county', 'coup', 'coupl', 'cour', 'cours', 'court', 'courtroom', 'cousin', 'cov', 'cow', 'coward', 'cowboy', 'cox', 'crack', 'craft', 'craig', 'cram', 'crap', 'crappy', 'crash', 'crav', 'crawford', 'crawl', 'craz', 'crazy', 'cre', 'cream', 'creasy', 'cred', 'credit', 'creek', 'creep', 'creepy', 'crew', 'cri', 'crim', 'crimin', 'cring', 'crippl', 'cris', 'crisp', 'crit', 'crocodil', 'crook', 'crop', 'crosby', 'cross', 'crow', 'crowd', 'crown', 'cru', 'cruc', 'crud', 'cruel', 'crush', 'cry', 'crypt', 'cryst', 'cub', 'cue', 'culmin', 'cult', 'cum', 'cunningham', 'cup', 'cur', 'curios', 'curs', 'curt', 'curtain', 'cury', 'cusack', 'cush', 'custom', 'cut', 'cyborg', 'cyc', 'cyn', 'cyph', 'da', 'dad', 'daddy', 'dahl', 'dahm', 'dai', 'daisy', 'dal', 'dalton', 'dam', 'damn', 'damon', 'dan', 'dandy', 'dang', 'daniel', 'danny', 'dant', 'dar', 'dark', 'darl', 'darn', 'dash', 'dat', 'daught', 'dav', 'david', 'davy', 'dawn', 'dawson', 'day', 'daylight', 'dazzl', 'de', 'dea', 'dead', 'deaf', 'deal', 'dealt', 'dean', 'deann', 'dear', 'death', 'deb', 'debby', 'debr', 'debt', 'debut', 'dec', 'decad', 'decapit', 'deceas', 'deceiv', 'decid', 'deck', 'decl', 'declin', 'ded', 'dee', 'deem', 'deep', 'deeply', 'deer', 'def', 'defend', 'defens', 'defin', 'definit', 'defy', 'deg', 'degr', 'degrad', 'del', 'delay', 'delet', 'delib', 'delicy', 'delight', 'deliry', 'delivery', 'delud', 'delv', 'dem', 'demand', 'demil', 'democr', 'demon', 'demonst', 'den', 'deniro', 'denou', 'dent', 'deny', 'denzel', 'dep', 'depart', 'depend', 'depict', 'deprav', 'depress', 'depth', 'deputy', 'der', 'derang', 'derek', 'des', 'desc', 'descend', 'describ', 'desert', 'deserv', 'design', 'desir', 'desp', 'despair', 'despit', 'destin', 'destiny', 'destroy', 'destruct', 'det', 'detach', 'detail', 'detect', 'determin', 'detery', 'detract', 'detroit', 'dev', 'devast', 'develop', 'devil', 'devo', 'devoid', 'devot', 'devy', 'dialog', 'diamond', 'dian', 'diary', 'dick', 'dict', 'did', 'die', 'died', 'diff', 'difficul', 'difficult', 'dig', 'digest', 'digit', 'dign', 'dil', 'dilemm', 'dim', 'dimend', 'dimin', 'din', 'dinosa', 'dir', 'direct', 'dirt', 'dirty', 'dis', 'disagr', 'disappear', 'disappoint', 'disast', 'disbeliev', 'disc', 'discern', 'disciplin', 'disco', 'discov', 'discovery', 'discuss', 'diseas', 'disgrac', 'disgu', 'disgust', 'dish', 'disjoint', 'dislik', 'dism', 'dismiss', 'disney', 'disord', 'dispatch', 'display', 'dispos', 'disregard', 'disrespect', 'dissolv', 'dist', 'distinct', 'distort', 'distract', 'distress', 'distribut', 'district', 'disturb', 'div', 'divers', 'divert', 'divid', 'divin', 'divorc', 'dixon', 'dj', 'do', 'doc', 'doct', 'docu', 'dodg', 'dog', 'dogm', 'dol', 'doll', 'dolph', 'dom', 'domest', 'domin', 'domino', 'don', 'donald', 'donn', 'dont', 'doo', 'doom', 'door', 'dor', 'dorothy', 'dos', 'dot', 'doubl', 'doubt', 'dougla', 'down', 'downey', 'downhil', 'download', 'downright', 'doyl', 'doz', 'dr', 'drab', 'dracul', 'draft', 'drag', 'dragon', 'drain', 'drak', 'dram', 'drama', 'draw', 'drawn', 'dre', 'dread', 'dream', 'dreamy', 'dreck', 'dress', 'drew', 'drift', 'dril', 'drink', 'drip', 'driv', 'drivel', 'dron', 'drop', 'drov', 'drown', 'drug', 'drum', 'drunk', 'dry', 'du', 'dub', 'duby', 'duck', 'dud', 'dudley', 'due', 'duel', 'duh', 'duk', 'dul', 'dumb', 'dumbest', 'dump', 'dun', 'duo', 'dur', 'dust', 'dustin', 'dutch', 'duty', 'duval', 'dvd', 'dvds', 'dwarf', 'dwel', 'dying', 'dyl', 'dynam', 'dysfunct', 'eag', 'eagl', 'ear', 'earl', 'earn', 'earnest', 'eas', 'east', 'eastern', 'eastwood', 'easy', 'eat', 'ebert', 'ecc', 'echo', 'econom', 'ed', 'eddy', 'edg', 'edgy', 'edi', 'edison', 'edit', 'educ', 'edward', 'edy', 'eery', 'effect', 'efficy', 'effort', 'effortless', 'eg', 'ego', 'egypt', 'eight', 'eighty', 'einstein', 'eith', 'el', 'elab', 'eld', 'elect', 'electron', 'eleg', 'eleph', 'elev', 'elimin', 'elit', 'elizabe', 'elliot', 'elm', 'els', 'else', 'elsewh', 'elud', 'elv', 'elvir', 'em', 'embark', 'embarrass', 'embody', 'embrac', 'emerg', 'emil', 'emm', 'emot', 'emp', 'empath', 'empathy', 'emphas', 'empir', 'employ', 'empty', 'emy', 'en', 'ench', 'enco', 'encount', 'end', 'endear', 'ending', 'endless', 'enemy', 'energet', 'energy', 'enforc', 'eng', 'engin', 'engl', 'england', 'engross', 'enh', 'enigm', 'enjoy', 'enl', 'enlight', 'enorm', 'enough', 'ens', 'ensembl', 'ensu', 'ent', 'enterpr', 'entertain', 'enthral', 'enthusiasm', 'enthusiast', 'entir', 'entitl', 'entry', 'environ', 'envy', 'ep', 'episod', 'epitom', 'eq', 'equ', 'equip', 'er', 'eras', 'erik', 'erot', 'errol', 'escap', 'esp', 'espec', 'esquir', 'ess', 'est', 'esth', 'estrang', 'et', 'etc', 'etern', 'eth', 'ethn', 'eug', 'europ', 'ev', 'evalu', 'evelyn', 'ever', 'every', 'everybody', 'everyday', 'everyon', 'everyth', 'everywh', 'evid', 'evil', 'evok', 'evolv', 'ex', 'exact', 'exag', 'examin', 'exampl', 'exceiv', 'excel', 'excess', 'exchang', 'excit', 'exclud', 'excrucy', 'excus', 'execut', 'exempl', 'exerc', 'exhaust', 'exhibit', 'exit', 'exorc', 'exot', 'expand', 'expect', 'expedit', 'expend', 'expens', 'expert', 'expery', 'expl', 'explain', 'explicit', 'explod', 'exploit', 'expos', 'exposit', 'express', 'exquisit', 'ext', 'extend', 'extery', 'extinct', 'extr', 'extra', 'extraordin', 'extrem', 'ey', 'eyebrow', 'eyr', 'fab', 'fabl', 'fabr', 'fac', 'facil', 'fact', 'fad', 'fai', 'fail', 'faint', 'fair', 'fairbank', 'fairy', 'faith', 'fak', 'fal', 'falk', 'fallon', 'fals', 'fam', 'famili', 'famy', 'fan', 'fant', 'fantast', 'fantasy', 'far', 'farc', 'farm', 'farrel', 'fart', 'fasc', 'fascin', 'fash', 'fassbind', 'fast', 'fat', 'fath', 'fault', 'fav', 'favo', 'favorit', 'favourit', 'fay', 'fbi', 'fear', 'feast', 'feat', 'fed', 'fee', 'feebl', 'feel', 'feet', 'feinston', 'fel', 'felix', 'fellin', 'fellow', 'felt', 'fem', 'femin', 'feminin', 'fent', 'fer', 'ferrel', 'fest', 'fet', 'fetch', 'fev', 'fi', 'fiant', 'fict', 'fido', 'field', 'fiend', 'fierc', 'fif', 'fifteen', 'fifty', 'fig', 'fight', 'fil', 'film', 'filmmak', 'filt', 'filthy', 'fin', 'find', 'finest', 'fing', 'finney', 'fir', 'firm', 'first', 'fish', 'fishburn', 'fist', 'fit', 'fiv', 'fix', 'flag', 'flair', 'flam', 'flash', 'flashback', 'flashy', 'flat', 'flav', 'flaw', 'flawless', 'fle', 'fleet', 'flem', 'flesh', 'fli', 'flick', 'flight', 'flimsy', 'flip', 'flirt', 'flo', 'flock', 'flood', 'flop', 'flor', 'florid', 'flow', 'fluff', 'fluid', 'fly', 'flyn', 'foc', 'focus', 'fog', 'foil', 'folk', 'follow', 'fond', 'fontain', 'food', 'fool', 'foot', 'footbal', 'for', 'forbid', 'forc', 'ford', 'foreign', 'foremost', 'forest', 'forev', 'forg', 'forget', 'forgot', 'form', 'formul', 'formula', 'forrest', 'fort', 'fortun', 'forty', 'forward', 'fost', 'fought', 'foul', 'found', 'four', 'fox', 'foxx', 'frag', 'fragil', 'frail', 'fram', 'franch', 'francisco', 'franco', 'frank', 'frankenstein', 'franklin', 'franky', 'frant', 'fraud', 'fre', 'freak', 'freaky', 'fred', 'freddy', 'freedom', 'freem', 'freez', 'french', 'frenzy', 'frequ', 'fresh', 'fri', 'friday', 'friend', 'fright', 'frog', 'from', 'front', 'fronty', 'frost', 'froz', 'fruit', 'frust', 'fry', 'fu', 'fuel', 'ful', 'fulc', 'fulfil', 'fun', 'funct', 'fund', 'funda', 'funniest', 'funny', 'furnit', 'furtherm', 'fury', 'fut', 'fuzzy', 'fx', 'gabl', 'gabriel', 'gadget', 'gag', 'gain', 'gal', 'galactic', 'galaxy', 'gallery', 'gam', 'gambl', 'gamer', 'gandh', 'gang', 'gangst', 'gap', 'gar', 'garb', 'garbo', 'gard', 'garland', 'garn', 'gary', 'gas', 'gasp', 'gat', 'gath', 'gav', 'gay', 'gaz', 'gear', 'geek', 'gem', 'gen', 'gend', 'genet', 'geni', 'genius', 'genr', 'gentl', 'gentlem', 'genuin', 'geny', 'georg', 'ger', 'gerard', 'germ', 'germany', 'gershwin', 'gest', 'get', 'ghetto', 'ghost', 'giallo', 'giant', 'gibson', 'gielgud', 'gift', 'gig', 'giggl', 'gil', 'gilbert', 'gilliam', 'gimmick', 'gin', 'ging', 'giovann', 'girl', 'girlfriend', 'giv', 'glad', 'glady', 'glam', 'glant', 'glar', 'glass', 'gle', 'glen', 'glimps', 'glob', 'gloom', 'glor', 'glory', 'glov', 'glow', 'glu', 'go', 'goal', 'goat', 'god', 'godard', 'godfath', 'godzill', 'goe', 'goer', 'going', 'gold', 'goldberg', 'goldbl', 'goldsworthy', 'goldy', 'gon', 'gonn', 'good', 'goodby', 'goodm', 'goody', 'goof', 'goofy', 'gor', 'gordon', 'gorg', 'gory', 'gosh', 'got', 'goth', 'gott', 'govern', 'govind', 'grab', 'grac', 'grad', 'gradu', 'graham', 'grainy', 'gram', 'grand', 'grandfath', 'grandm', 'grandmoth', 'grandp', 'grant', 'graph', 'grasp', 'grass', 'grat', 'gratuit', 'grav', 'graveyard', 'gray', 'grayson', 'gre', 'great', 'greatest', 'gree', 'greedy', 'greek', 'green', 'greet', 'greg', 'grew', 'grey', 'grief', 'griev', 'griffi', 'grim', 'grin', 'grinch', 'grind', 'grip', 'gritty', 'gro', 'gross', 'grotesqu', 'ground', 'group', 'grow', 'grown', 'grudg', 'gruesom', 'guar', 'guarantee', 'guard', 'guess', 'guest', 'guid', 'guil', 'guilt', 'guin', 'guine', 'guit', 'gum', 'gun', 'gundam', 'gunfight', 'gung', 'gut', 'guy', 'gwyne', 'gypo', 'gypsy', 'ha', 'habit', 'hack', 'hackm', 'hackney', 'hadley', 'hag', 'hail', 'hain', 'hair', 'hal', 'half', 'halfway', 'hallmark', 'halloween', 'hallucin', 'ham', 'hamilton', 'hamlet', 'hammy', 'han', 'hand', 'handicap', 'handl', 'handsom', 'hang', 'hank', 'hannah', 'hap', 'hapless', 'happy', 'har', 'harass', 'harb', 'hard', 'hardc', 'hardy', 'hark', 'harlow', 'harm', 'harmless', 'harold', 'harp', 'harriet', 'harrison', 'harrow', 'harry', 'harsh', 'hart', 'hartley', 'harvey', 'hat', 'hatch', 'haunt', 'havoc', 'hawk', 'hawn', 'hay', 'haywor', 'hbo', 'hea', 'head', 'headach', 'heal', 'healthy', 'heap', 'hear', 'heard', 'heart', 'heartbreak', 'heartfelt', 'heartwarm', 'heat', 'heav', 'heavy', 'heck', 'hect', 'heel', 'height', 'heist', 'hel', 'held', 'helicopt', 'hello', 'helm', 'helmet', 'help', 'helpless', 'henchm', 'henry', 'hent', 'hepburn', 'her', 'herbert', 'herd', 'here', 'herm', 'hero', 'heroin', 'hesit', 'heston', 'hey', 'hi', 'hick', 'hid', 'high', 'highest', 'highlight', 'highway', 'hil', 'him', 'hind', 'hint', 'hip', 'hippy', 'hir', 'hist', 'hit', 'hitch', 'hitchcock', 'hitl', 'hk', 'hmmm', 'ho', 'hoffm', 'hog', 'hokey', 'hol', 'hold', 'holiday', 'hollow', 'hollywood', 'holm', 'holocaust', 'holy', 'hom', 'homeless', 'homicid', 'homosex', 'hon', 'honest', 'honesty', 'hong', 'hono', 'hood', 'hook', 'hoop', 'hoot', 'hop', 'hopeless', 'hopkin', 'hor', 'horn', 'horny', 'horr', 'horrend', 'horrid', 'hors', 'hospit', 'host', 'hostel', 'hostil', 'hot', 'hotel', 'hound', 'hour', 'hous', 'household', 'housew', 'how', 'howard', 'howev', 'howl', 'http', 'hudson', 'hug', 'hugh', 'huh', 'hulk', 'hum', 'humbl', 'humo', 'humy', 'hundr', 'hung', 'hungry', 'hunt', 'hurry', 'hurt', 'husband', 'hustl', 'huston', 'hybrid', 'hyd', 'hyp', 'hypnot', 'hyst', 'ian', 'ic', 'icon', 'id', 'ide', 'idea', 'ident', 'ideolog', 'idiot', 'idol', 'ie', 'if', 'ign', 'ii', 'il', 'illeg', 'illog', 'illud', 'illust', 'im', 'imagery', 'imagin', 'imdb', 'imit', 'immedy', 'immens', 'immers', 'immigr', 'immort', 'imo', 'imp', 'impact', 'impecc', 'imperson', 'impl', 'implaus', 'imply', 'import', 'impos', 'imposs', 'impress', 'imprison', 'improb', 'improv', 'impuls', 'in', 'inacc', 'inadvert', 'inappropry', 'incap', 'incarn', 'incest', 'inch', 'incid', 'inclin', 'includ', 'incoh', 'incompet', 'incomprehens', 'inconsist', 'incorp', 'incorrect', 'increas', 'incred', 'ind', 'indee', 'independ', 'indian', 'indiff', 'individ', 'induc', 'indulg', 'indust', 'industry', 'indy', 'inept', 'inevit', 'inexpery', 'inexpl', 'inf', 'infam', 'infect', 'infery', 'infinit', 'inflict', 'influ', 'info', 'inform', 'ing', 'ingeny', 'ingredy', 'ingrid', 'inh', 'inhabit', 'inherit', 'init', 'inject', 'injury', 'injust', 'inm', 'innoc', 'innov', 'innuendo', 'ins', 'insec', 'insect', 'insert', 'insid', 'insight', 'insign', 'insipid', 'insist', 'insomn', 'inspect', 'inspir', 'inst', 'instal', 'instead', 'instinct', 'institut', 'instru', 'instruct', 'insult', 'int', 'intact', 'integr', 'intellect', 'intellig', 'intend', 'intens', 'interact', 'interest', 'interf', 'intern', 'internet', 'interpret', 'interrupt', 'intertwin', 'interv', 'interview', 'intery', 'intim', 'intol', 'intrigu', 'intro', 'introduc', 'intrud', 'inv', 'invad', 'invas', 'invest', 'investig', 'invis', 'invit', 'involv', 'iq', 'ir', 'iraq', 'ireland', 'iron', 'irony', 'irrelev', 'irrit', 'is', 'isabel', 'ish', 'islam', 'island', 'isol', 'israel', 'issu', 'it', 'ita', 'item', 'iturb', 'iv', 'jack', 'jacket', 'jackson', 'jacky', 'jacob', 'jacqu', 'jad', 'jaff', 'jag', 'jail', 'jak', 'jam', 'jamy', 'jan', 'jap', 'japanes', 'jar', 'jason', 'jaw', 'jay', 'jazz', 'jeal', 'jealousy', 'jean', 'jed', 'jeff', 'jeffrey', 'jen', 'jenn', 'jenny', 'jeremy', 'jerk', 'jerry', 'jersey', 'jes', 'jess', 'jessic', 'jet', 'jew', 'jewel', 'jil', 'jim', 'jimmy', 'joan', 'job', 'jock', 'jody', 'joe', 'joel', 'joey', 'johansson', 'john', 'johnny', 'johnson', 'join', 'joint', 'jok', 'joly', 'jon', 'jonath', 'jord', 'jos', 'joseph', 'josh', 'journ', 'journey', 'jov', 'joy', 'jr', 'juan', 'jud', 'judg', 'judy', 'juic', 'jul', 'juliet', 'july', 'jump', 'jun', 'jungl', 'junk', 'juny', 'jury', 'just', 'justin', 'juvenil', 'kan', 'kansa', 'kapo', 'kar', 'karl', 'karloff', 'kat', 'kathleen', 'kathryn', 'kathy', 'katy', 'kay', 'kaz', 'keaton', 'keen', 'keep', 'kei', 'kel', 'ken', 'kenne', 'kennedy', 'kent', 'kept', 'kevin', 'key', 'khan', 'kick', 'kid', 'kiddy', 'kidm', 'kidnap', 'kil', 'kim', 'kind', 'king', 'kingdom', 'kinnear', 'kirk', 'kiss', 'kit', 'kitch', 'kitty', 'klin', 'kne', 'knew', 'knif', 'knight', 'knightley', 'knock', 'know', 'knowledg', 'known', 'kolchak', 'kong', 'kor', 'kore', 'kri', 'kubrick', 'kudo', 'kum', 'kung', 'kurosaw', 'kurt', 'kyl', 'la', 'lab', 'label', 'lac', 'lack', 'lacklust', 'lad', 'lady', 'laid', 'lak', 'lam', 'lamb', 'lampoon', 'lan', 'land', 'landmark', 'landscap', 'lang', 'langu', 'lant', 'laput', 'lar', 'larg', 'larry', 'las', 'last', 'lat', 'latest', 'latin', 'latino', 'laugh', 'laught', 'launch', 'laur', 'laurel', 'laury', 'lav', 'law', 'lawr', 'lawy', 'lay', 'lazy', 'le', 'lead', 'leagu', 'lean', 'leap', 'learn', 'least', 'leath', 'leav', 'lect', 'led', 'lee', 'left', 'leg', 'legend', 'legitim', 'leigh', 'lemmon', 'len', 'lend', 'leng', 'lengthy', 'lennon', 'leo', 'leon', 'leonard', 'les', 'lesb', 'less', 'lesson', 'lest', 'let', 'leth', 'lev', 'level', 'lew', 'lex', 'li', 'liam', 'lib', 'liberty', 'libr', 'licens', 'lie', 'lif', 'life', 'lifeless', 'lifestyl', 'lifetim', 'lift', 'light', 'lightn', 'lik', 'likew', 'lil', 'lily', 'limb', 'limit', 'lin', 'lincoln', 'lind', 'lindsay', 'linear', 'ling', 'link', 'lion', 'lionel', 'lip', 'lis', 'list', 'lit', 'littl', 'liu', 'liv', 'liz', 'lizard', 'lloyd', 'load', 'loath', 'loc', 'lock', 'log', 'loi', 'lol', 'lon', 'london', 'long', 'longest', 'longor', 'look', 'loom', 'loop', 'loos', 'lor', 'lord', 'lorett', 'los', 'loss', 'lost', 'lot', 'lou', 'loud', 'lousy', 'lov', 'love', 'low', 'lowest', 'loy', 'loyal', 'luc', 'luca', 'lucil', 'luck', 'lucky', 'lucy', 'ludicr', 'lugos', 'lui', 'luk', 'luka', 'lumet', 'lun', 'lunch', 'lundgr', 'lung', 'lur', 'lurk', 'lush', 'lust', 'luth', 'luxury', 'lying', 'lynch', 'lyr', 'mabel', 'mac', 'macabr', 'macarth', 'macdonald', 'machin', 'macho', 'macy', 'mad', 'made', 'madm', 'madonn', 'mads', 'mae', 'maf', 'mag', 'magazin', 'maggy', 'magn', 'maid', 'mail', 'main', 'mainstream', 'maintain', 'maj', 'mak', 'makeup', 'mal', 'malon', 'mam', 'man', 'mand', 'mandy', 'mang', 'manhat', 'maniac', 'manifest', 'manip', 'mankind', 'manufact', 'many', 'map', 'mar', 'marc', 'march', 'margaret', 'margin', 'marilyn', 'marin', 'mario', 'mark', 'market', 'marl', 'marlon', 'marqu', 'marry', 'marsh', 'marshal', 'mart', 'marth', 'martin', 'marty', 'marvel', 'marx', 'mary', 'mask', 'masoch', 'mason', 'mass', 'massacr', 'mast', 'masterpiec', 'masterson', 'masturb', 'mat', 'match', 'mathieu', 'matrix', 'matthau', 'matthew', 'maureen', 'max', 'maxim', 'may', 'mayb', 'mayhem', 'mccarthy', 'mccoy', 'mclaglen', 'mcqueen', 'me', 'meadow', 'meal', 'mean', 'meand', 'meaningless', 'meant', 'meantim', 'meanwhil', 'meas', 'meat', 'mech', 'med', 'medicin', 'mediev', 'mediocr', 'medit', 'meek', 'meet', 'meg', 'mel', 'meliss', 'melodram', 'melody', 'melt', 'melvyn', 'mem', 'memb', 'men', 'menac', 'ment', 'mer', 'merc', 'merciless', 'mercy', 'merit', 'mermaid', 'merry', 'meryl', 'mesm', 'mess', 'messy', 'met', 'metaph', 'method', 'mex', 'mexico', 'mey', 'mgm', 'miam', 'mic', 'mich', 'michael', 'michel', 'mick', 'mickey', 'mid', 'middl', 'midget', 'midl', 'midnight', 'midst', 'might', 'mighty', 'miik', 'mik', 'mil', 'mild', 'mildr', 'milit', 'milk', 'millionair', 'milo', 'mim', 'min', 'mind', 'mindless', 'minim', 'minisery', 'minnell', 'minut', 'mir', 'mirac', 'mirand', 'mis', 'miscast', 'misery', 'misfit', 'misfortun', 'misguid', 'mislead', 'miss', 'missil', 'mist', 'mistak', 'mistress', 'misunderstand', 'misunderstood', 'mitch', 'mitchel', 'mix', 'mixt', 'miyazak', 'mm', 'mob', 'mobl', 'mobst', 'mock', 'mod', 'model', 'modern', 'modest', 'modesty', 'moe', 'mol', 'molest', 'mom', 'moment', 'mon', 'money', 'monit', 'monk', 'monkey', 'monolog', 'monoton', 'monst', 'mont', 'montan', 'month', 'monty', 'monu', 'mood', 'moody', 'moon', 'moor', 'mor', 'morbid', 'more', 'moreov', 'morg', 'mormon', 'morn', 'moron', 'mort', 'moss', 'most', 'mot', 'moth', 'motorcyc', 'mou', 'mount', 'mountain', 'mourn', 'mous', 'mouth', 'mov', 'movie', 'movies', 'movy', 'mr', 'mrs', 'ms', 'mst', 'mtv', 'much', 'muddl', 'mug', 'mult', 'multipl', 'mum', 'mummy', 'mund', 'muppet', 'murd', 'murky', 'murph', 'murray', 'mus', 'musc', 'muse', 'muslim', 'must', 'mut', 'mutil', 'myer', 'myrtl', 'myst', 'mystery', 'myth', 'mytholog', 'nad', 'nail', 'naiv', 'nak', 'nam', 'nant', 'nar', 'narrow', 'naschy', 'nasty', 'nat', 'nata', 'natal', 'nath', 'naughty', 'naus', 'navy', 'naz', 'nbc', 'nd', 'near', 'nearby', 'neat', 'necess', 'neck', 'ned', 'nee', 'needless', 'neg', 'neglect', 'neighb', 'neighbo', 'neil', 'neith', 'nelson', 'nemes', 'neo', 'nephew', 'nerd', 'nerv', 'net', 'netflix', 'network', 'neurot', 'neut', 'nev', 'nevertheless', 'new', 'newcom', 'newm', 'newspap', 'next', 'nic', 'nichola', 'nicholson', 'nick', 'nicol', 'nicola', 'niec', 'night', 'nightclub', 'nightm', 'nin', 'ninj', 'niro', 'niv', 'no', 'nobl', 'nobody', 'nod', 'noir', 'nois', 'nol', 'nolt', 'nomin', 'non', 'nonetheless', 'nonsens', 'nop', 'nor', 'norm', 'northam', 'northern', 'nos', 'nostalg', 'not', 'notch', 'noteworthy', 'noth', 'novak', 'novel', 'now', 'nowaday', 'nowh', 'nuant', 'nuclear', 'nud', 'num', 'numb', 'nun', 'nurs', 'nut', 'ny', 'nyc', 'object', 'oblig', 'obnoxy', 'obsc', 'observ', 'obsess', 'obstac', 'obtain', 'obvy', 'oc', 'occ', 'occas', 'occult', 'occup', 'occupy', 'occur', 'octob', 'od', 'odyssey', 'off', 'offb', 'offend', 'oft', 'oh', 'oil', 'ok', 'okay', 'ol', 'old', 'oldest', 'oliv', 'olivy', 'olymp', 'om', 'omin', 'omit', 'on', 'one', 'onlin', 'onto', 'op', 'oper', 'opin', 'oppon', 'opportun', 'oppos', 'opposit', 'oppress', 'opt', 'optim', 'or', 'orang', 'orchest', 'ord', 'ordin', 'org', 'orgy', 'origin', 'orl', 'orph', 'orson', 'ory', 'osc', 'oth', 'othello', 'otherw', 'otto', 'ought', 'out', 'outcom', 'outd', 'outdo', 'outfit', 'outland', 'outlaw', 'outlin', 'outright', 'outsid', 'outstand', 'ov', 'over', 'overact', 'overal', 'overblown', 'overboard', 'overcom', 'overdon', 'overlong', 'overlook', 'overshadow', 'overt', 'overwhelm', 'ow', 'owl', 'own', 'oz', 'pac', 'pacino', 'pack', 'pad', 'pag', 'paid', 'pain', 'paint', 'pair', 'pal', 'palac', 'palestin', 'palm', 'paltrow', 'pamel', 'pan', 'pant', 'pap', 'par', 'parad', 'parallel', 'paramount', 'parano', 'paranoid', 'park', 'parody', 'parrot', 'parson', 'part', 'particip', 'particul', 'partn', 'party', 'pass', 'passeng', 'past', 'pat', 'patch', 'path', 'pathet', 'patho', 'patric', 'patrick', 'patriot', 'patron', 'pattern', 'paty', 'pau', 'paul', 'paus', 'paxton', 'pay', 'paycheck', 'payoff', 'pc', 'peac', 'peak', 'pearl', 'peck', 'peculi', 'pedest', 'pee', 'peer', 'peg', 'pen', 'penelop', 'penguin', 'penny', 'peopl', 'people', 'pep', 'per', 'perc', 'perceiv', 'perfect', 'perform', 'perhap', 'peril', 'period', 'perm', 'permit', 'perpet', 'perry', 'person', 'perspect', 'persuad', 'pervers', 'pervert', 'pet', 'petty', 'pfeiff', 'pg', 'phantasm', 'phantom', 'phas', 'phenom', 'phenomenon', 'phil', 'philip', 'phillip', 'philosoph', 'phoenix', 'phon', 'phony', 'photo', 'photograph', 'phrase', 'phys', 'piano', 'pick', 'pickford', 'pict', 'pie', 'piec', 'pier', 'pierc', 'pig', 'pil', 'pilot', 'pin', 'pink', 'pion', 'pip', 'pir', 'pistol', 'pit', 'pitch', 'pity', 'pivot', 'pix', 'plac', 'place', 'plagu', 'plain', 'plan', 'planet', 'plant', 'plast', 'plat', 'platform', 'plaus', 'play', 'playboy', 'playwright', 'pleas', 'please', 'plenty', 'plight', 'plod', 'plot', 'plu', 'plug', 'plum', 'pocket', 'poe', 'poem', 'poet', 'poetry', 'poign', 'point', 'pointless', 'poison', 'pok', 'pokemon', 'pol', 'polansk', 'policem', 'policy', 'polit', 'pond', 'pool', 'poor', 'pop', 'popcorn', 'popul', 'porn', 'porno', 'pornograph', 'port', 'portrait', 'portray', 'pos', 'posey', 'posit', 'poss', 'possess', 'post', 'pot', 'pound', 'pour', 'poverty', 'pow', 'powel', 'pra', 'pract', 'prank', 'pray', 'pre', 'preach', 'preachy', 'prec', 'precy', 'pred', 'predecess', 'predict', 'pref', 'prefer', 'pregn', 'prejud', 'prem', 'premy', 'prep', 'prepost', 'prequel', 'pres', 'preserv', 'presid', 'press', 'preston', 'presum', 'pretend', 'pretenty', 'pretty', 'prev', 'prevail', 'preview', 'prevy', 'prey', 'pri', 'pric', 'priceless', 'prid', 'priest', 'prim', 'primit', 'princess', 'princip', 'principl', 'print', 'prison', 'priv', 'privileg', 'priz', 'pro', 'prob', 'problem', 'proc', 'process', 'proclaim', 'produc', 'prof', 'profess', 'profil', 'profit', 'profound', 'program', 'progress', 'project', 'prolog', 'prom', 'promin', 'promot', 'prompt', 'pronount', 'proof', 'prop', 'propagand', 'property', 'prophecy', 'prophet', 'proport', 'propos', 'prosecut', 'prospect', 'prostitut', 'protagon', 'protect', 'protest', 'proud', 'prov', 'provid', 'provoc', 'provok', 'ps', 'pseudo', 'psych', 'psycho', 'psycholog', 'psychopa', 'psychot', 'psychy', 'pub', 'publ', 'puerto', 'pul', 'pulp', 'pumba', 'pump', 'pun', 'punch', 'punk', 'puppet', 'puppy', 'pur', 'purchas', 'purpl', 'purpos', 'pursu', 'pursuit', 'push', 'put', 'puzzl', 'python', 'quaid', 'qual', 'quant', 'quart', 'quas', 'queen', 'quentin', 'quest', 'quick', 'quiet', 'quin', 'quintess', 'quirky', 'quit', 'quot', 'rabbit', 'rac', 'rachel', 'rack', 'rad', 'radio', 'rady', 'rag', 'raid', 'rail', 'rain', 'rainy', 'rais', 'raj', 'ralph', 'ram', 'rambl', 'rambo', 'ramon', 'ramp', 'ran', 'ranch', 'randolph', 'random', 'randy', 'rang', 'rank', 'rant', 'rao', 'rap', 'rapid', 'rapt', 'rar', 'rat', 'rath', 'ratso', 'rav', 'raw', 'ray', 'raymond', 'raz', 'rd', 'rea', 'reach', 'react', 'read', 'ready', 'real', 'really', 'realm', 'rear', 'reason', 'rebel', 'rec', 'recal', 'receiv', 'recit', 'reckless', 'recogn', 'recognit', 'recommend', 'record', 'recov', 'recr', 'recruit', 'recyc', 'red', 'redeem', 'redempt', 'redneck', 'reduc', 'redund', 'ree', 'reel', 'reev', 'ref', 'refer', 'reflect', 'refresh', 'refug', 'refus', 'reg', 'regain', 'regard', 'regardless', 'regim', 'regret', 'regul', 'rehash', 'rehears', 'reid', 'reign', 'reincarn', 'reinforc', 'reis', 'reject', 'rel', 'relax', 'releas', 'relentless', 'relev', 'reliev', 'relig', 'religy', 'reluct', 'rely', 'remad', 'remain', 'remak', 'remark', 'rememb', 'remind', 'reminisc', 'remot', 'remov', 'ren', 'renaiss', 'rend', 'rendit', 'rent', 'rep', 'repetit', 'replac', 'replay', 'reply', 'report', 'repr', 'repres', 'repress', 'republ', 'repuls', 'reput', 'request', 'requir', 'rerun', 'res', 'rescu', 'research', 'resembl', 'reserv', 'resid', 'resist', 'resolv', 'reson', 'resort', 'resourc', 'respect', 'respond', 'respons', 'rest', 'resta', 'restrain', 'restraint', 'restrict', 'result', 'resum', 'resurrect', 'ret', 'retain', 'retard', 'retir', 'retriev', 'retrospect', 'return', 'reun', 'reunit', 'rev', 'revel', 'reveng', 'revers', 'review', 'revisit', 'revolt', 'revolv', 'reward', 'rewrit', 'rex', 'reynold', 'rhym', 'rhythm', 'ric', 'rich', 'richard', 'richardson', 'rick', 'ricky', 'rid', 'riddl', 'ridic', 'riff', 'rifl', 'rig', 'right', 'ring', 'riot', 'rip', 'ripoff', 'ris', 'risk', 'rit', 'ritchy', 'riv', 'rivet', 'road', 'roam', 'roar', 'rob', 'robbery', 'robbin', 'robby', 'robert', 'robertson', 'robin', 'robinson', 'robot', 'rochest', 'rock', 'rocket', 'rocky', 'rod', 'rog', 'rohm', 'rol', 'rom', 'romeo', 'romero', 'romp', 'ron', 'ronald', 'ronny', 'roof', 'rooky', 'room', 'rooney', 'root', 'rop', 'ros', 'rosario', 'rosem', 'ross', 'rot', 'roth', 'rough', 'round', 'rous', 'rout', 'routin', 'row', 'rowland', 'roy', 'rub', 'ruby', 'rud', 'rug', 'ruin', 'rukh', 'rul', 'rum', 'run', 'runaway', 'rur', 'rush', 'russ', 'russel', 'ruth', 'ruthless', 'ryan', 'sabot', 'sabrin', 'sack', 'sacr', 'sad', 'saddl', 'saf', 'sag', 'said', 'sail', 'saint', 'sak', 'sal', 'salesm', 'salm', 'saloon', 'salt', 'salv', 'sam', 'samanth', 'sammo', 'samura', 'san', 'sand', 'sandl', 'sandr', 'sang', 'sant', 'sap', 'sappy', 'sar', 'sarah', 'sarandon', 'sarcasm', 'sarcast', 'sassy', 'sat', 'satir', 'satisfact', 'satisfy', 'saturday', 'sav', 'saw', 'say', 'scal', 'scan', 'scand', 'scar', 'scarc', 'scarecrow', 'scarfac', 'scariest', 'scarlet', 'scary', 'scat', 'scen', 'scenario', 'scenery', 'schedule', 'scheme', 'schlock', 'schneider', 'school', 'schools', 'sci', 'scif', 'scooby', 'scoop', 'scop', 'scor', 'scorses', 'scot', 'scotland', 'scratch', 'scream', 'screaming', 'screams', 'screen', 'screening', 'screenplay', 'screens', 'screenwriter', 'screenwriters', 'screw', 'screwball', 'screwed', 'script', 'scripted', 'scripting', 'scripts', 'scrooge', 'se', 'sea', 'seag', 'seal', 'sean', 'search', 'season', 'seat', 'sebast', 'sec', 'second', 'secret', 'sect', 'seduc', 'see', 'seedy', 'seek', 'seem', 'seen', 'seg', 'sel', 'seldom', 'select', 'self', 'sem', 'sen', 'send', 'sens', 'senseless', 'sensit', 'sent', 'sentinel', 'senty', 'seny', 'sep', 'septemb', 'sequ', 'sequel', 'ser', 'serb', 'serg', 'serv', 'sery', 'sess', 'set', 'settl', 'setup', 'sev', 'seventy', 'sew', 'sex', 'sexy', 'seymo', 'sf', 'sg', 'sgt', 'sh', 'shad', 'shadow', 'shaggy', 'shah', 'shahid', 'shak', 'shakespear', 'shaky', 'shal', 'shallow', 'sham', 'shameless', 'shangha', 'shap', 'shar', 'shark', 'sharon', 'sharp', 'shat', 'shav', 'shaw', 'she', 'shed', 'sheen', 'sheet', 'shel', 'shelf', 'shelley', 'shelt', 'shepard', 'shepherd', 'sheriff', 'shield', 'shift', 'shin', 'ship', 'shirley', 'shirt', 'sho', 'shock', 'shoddy', 'shoot', 'shootout', 'shop', 'shor', 'short', 'shortcom', 'shot', 'shotgun', 'should', 'shout', 'shov', 'show', 'showcas', 'showdown', 'shown', 'shut', 'shy', 'sibl', 'sick', 'sid', 'sidekick', 'sidewalk', 'sidney', 'sigh', 'sight', 'sign', 'sil', 'silv', 'sim', 'simil', 'simmon', 'simon', 'simpl', 'simply', 'simpson', 'simult', 'sin', 'sinatr', 'sing', 'singl', 'sink', 'sint', 'sir', 'sirk', 'sissy', 'sist', 'sit', 'sitcom', 'situ', 'six', 'sixteen', 'sixty', 'siz', 'skat', 'skept', 'sketch', 'ski', 'skil', 'skin', 'skinny', 'skip', 'skit', 'skul', 'sky', 'slack', 'slam', 'slap', 'slapstick', 'slash', 'slat', 'slaught', 'slav', 'slay', 'sleaz', 'sleazy', 'sleep', 'sleepwalk', 'slic', 'slick', 'slid', 'slight', 'slightest', 'slim', 'slimy', 'slip', 'slo', 'sloppy', 'slow', 'slug', 'slut', 'sly', 'smack', 'smal', 'smart', 'smash', 'smel', 'smi', 'smil', 'smok', 'smoo', 'smug', 'smuggl', 'snak', 'snap', 'snatch', 'sneak', 'snip', 'snl', 'snob', 'snow', 'snowm', 'snuff', 'so', 'soap', 'sob', 'soc', 'socc', 'socy', 'soderbergh', 'soft', 'sol', 'sold', 'soldy', 'solid', 'solo', 'solv', 'somebody', 'someday', 'somehow', 'someon', 'someth', 'sometim', 'somewh', 'son', 'sondr', 'song', 'sonny', 'soon', 'soph', 'soprano', 'sor', 'sorrow', 'sorry', 'sort', 'sou', 'sought', 'soul', 'sound', 'soundtrack', 'soup', 'sour', 'sourc', 'southern', 'soviet', 'sox', 'soyl', 'spac', 'spacey', 'spad', 'spaghett', 'spain', 'span', 'spar', 'spark', 'sparkl', 'spawn', 'speak', 'spear', 'spec', 'spect', 'spectac', 'spectacul', 'specy', 'spee', 'speech', 'spel', 'spend', 'spent', 'spi', 'spic', 'spid', 'spielberg', 'spik', 'spil', 'spin', 'spir', 'spirit', 'spit', 'splatter', 'splendid', 'split', 'spock', 'spoil', 'spok', 'spont', 'spoof', 'spooky', 'spoon', 'sport', 'spot', 'spotlight', 'spout', 'spread', 'spree', 'spring', 'springer', 'spy', 'squ', 'squad', 'squeez', 'st', 'stab', 'stabl', 'stack', 'stad', 'staff', 'stag', 'stair', 'stak', 'stal', 'stalk', 'stallon', 'stamp', 'stan', 'stand', 'standard', 'standout', 'stanley', 'stant', 'stanwyck', 'star', 'stardom', 'stardust', 'starg', 'stark', 'start', 'startl', 'starv', 'stat', 'statu', 'stay', 'ste', 'steady', 'steam', 'steel', 'stell', 'step', 'steph', 'stephany', 'stereotyp', 'sterl', 'stern', 'stev', 'stewart', 'stick', 'stiff', 'stil', 'stilt', 'stim', 'stink', 'stir', 'stock', 'stol', 'stomach', 'ston', 'stood', 'stoog', 'stop', 'stor', 'storm', 'story', 'storylin', 'storytel', 'straight', 'straightforward', 'stranded', 'strange', 'strangely', 'stranger', 'strangers', 'stream', 'streep', 'street', 'streets', 'streisand', 'strength', 'strengths', 'stress', 'stretch', 'stretched', 'strict', 'strictly', 'strike', 'strikes', 'striking', 'string', 'strings', 'strip', 'stroke', 'strong', 'stronger', 'strongest', 'strongly', 'struck', 'structure', 'struggle', 'struggles', 'struggling', 'stuart', 'stuck', 'stud', 'studio', 'study', 'stuff', 'stumbl', 'stun', 'stunt', 'stupid', 'styl', 'sub', 'subject', 'sublim', 'submarin', 'submit', 'subplot', 'subsequ', 'subst', 'substitut', 'subt', 'subtext', 'subtitl', 'subtl', 'suburb', 'subvert', 'subway', 'success', 'suck', 'sud', 'sue', 'suff', 'sufficy', 'sug', 'suggest', 'suicid', 'suit', 'sul', 'sum', 'summ', 'sun', 'sund', 'sunday', 'sung', 'sunk', 'sunny', 'sunr', 'sunset', 'sunshin', 'sup', 'superb', 'superbl', 'superf', 'superhero', 'superm', 'supern', 'superst', 'supery', 'supply', 'support', 'suppos', 'suppress', 'suprem', 'sur', 'surf', 'surfac', 'surgery', 'surpass', 'surpr', 'surrend', 'surround', 'surv', 'sus', 'susp', 'suspect', 'suspend', 'suspens', 'suspicy', 'sustain', 'sutherland', 'swallow', 'sway', 'swe', 'swear', 'swed', 'sweep', 'sweet', 'swept', 'swift', 'swim', 'swing', 'switch', 'sword', 'sydney', 'symbol', 'sympath', 'sympathet', 'sympathy', 'syndrom', 'synops', 'system', 'tabl', 'taboo', 'tack', 'tackl', 'tacky', 'tact', 'tad', 'tag', 'tail', 'tak', 'tal', 'talk', 'talky', 'tam', 'tang', 'tank', 'tap', 'tar', 'tarantino', 'target', 'tarz', 'task', 'tast', 'tasteless', 'tat', 'tattoo', 'taught', 'tax', 'tayl', 'tcm', 'tea', 'teach', 'team', 'tear', 'teas', 'tech', 'techn', 'technicol', 'technolog', 'ted', 'tedy', 'tee', 'teen', 'tel', 'televid', 'temp', 'templ', 'tempt', 'ten', 'tend', 'tens', 'tent', 'ter', 'term', 'termin', 'terr', 'territ', 'terry', 'test', 'testa', 'texa', 'text', 'th', 'thank', 'that', 'the', 'thelm', 'them', 'therapy', 'there', 'theref', 'thick', 'thief', 'thiev', 'thin', 'thing', 'think', 'thinking', 'third', 'thirty', 'this', 'tho', 'thoma', 'thompson', 'thorn', 'thorough', 'though', 'thought', 'thousand', 'thread', 'threat', 'threatened', 'threatening', 'threatens', 'three', 'threw', 'thrill', 'thrilled', 'thriller', 'thrillers', 'thrilling', 'thrills', 'throat', 'throughout', 'throw', 'throwing', 'thrown', 'throws', 'thru', 'thu', 'thug', 'thumb', 'thund', 'thunderbird', 'thurm', 'tick', 'ticket', 'tid', 'tie', 'tied', 'tierney', 'tig', 'tight', 'til', 'tim', 'timberlak', 'time', 'timeless', 'timmy', 'timon', 'timothy', 'tin', 'tiny', 'tip', 'tir', 'tiresom', 'tit', 'titl', 'toby', 'tod', 'today', 'toe', 'togeth', 'toilet', 'tok', 'tokyo', 'tol', 'told', 'tom', 'tomato', 'tomb', 'tome', 'tommy', 'tomorrow', 'ton', 'tongu', 'tonight', 'tony', 'too', 'took', 'tool', 'top', 'topless', 'tor', 'torch', 'torn', 'toronto', 'tort', 'toss', 'tot', 'touch', 'tough', 'tour', 'tow', 'toward', 'town', 'toy', 'trac', 'track', 'tracy', 'trad', 'trademark', 'tradit', 'traff', 'trag', 'tragedy', 'trail', 'train', 'trait', 'tramp', 'transcend', 'transf', 'transform', 'transit', 'transl', 'transmit', 'transp', 'transpl', 'transport', 'trap', 'trash', 'trashy', 'traum', 'trav', 'travel', 'travesty', 'tre', 'treas', 'trek', 'tremend', 'trend', 'tri', 'triangl', 'trib', 'tribut', 'trick', 'trig', 'trilog', 'trio', 'trip', 'tripl', 'trit', 'triumph', 'triv', 'trom', 'troop', 'troubl', 'tru', 'truck', 'trum', 'trust', 'truth', 'try', 'tub', 'tuck', 'tun', 'tunnel', 'turd', 'turk', 'turkey', 'turmoil', 'turn', 'turtl', 'tv', 'twelv', 'twenty', 'twic', 'twilight', 'twin', 'twist', 'two', 'tyl', 'typ', 'ug', 'ugh', 'uh', 'uk', 'ultim', 'ultimat', 'ultr', 'um', 'un', 'unansw', 'unattract', 'unaw', 'unbear', 'unbeliev', 'unc', 'uncanny', 'uncomfort', 'unconscy', 'unconv', 'unconvint', 'uncov', 'uncut', 'und', 'undead', 'undeny', 'under', 'underground', 'undermin', 'undernea', 'underst', 'understand', 'understood', 'undertak', 'underw', 'underwear', 'underworld', 'undoubt', 'uneasy', 'unev', 'unexpect', 'unexplain', 'unfair', 'unfold', 'unforg', 'unforget', 'unfortun', 'unfunny', 'unhappy', 'uniform', 'unimagin', 'uninspir', 'unint', 'uninterest', 'unit', 'univers', 'unknown', 'unleash', 'unless', 'unlik', 'unnecess', 'unorigin', 'unpleas', 'unpredict', 'unr', 'unravel', 'unrel', 'uns', 'unsatisfy', 'unseen', 'unsettl', 'unst', 'unsuspect', 'unsympathet', 'unus', 'unw', 'unwatch', 'unwil', 'up', 'upcom', 'upd', 'uplift', 'upon', 'upset', 'upsid', 'urb', 'urg', 'us', 'useless', 'ustinov', 'ut', 'util', 'uw', 'vac', 'vacu', 'vad', 'vagu', 'vain', 'val', 'valentin', 'valid', 'valley', 'valu', 'vampir', 'van', 'vaness', 'vanill', 'vant', 'vary', 'vast', 'vault', 'veg', 'vega', 'vehic', 'vein', 'ven', 'venezuel', 'veng', 'venom', 'vent', 'ver', 'verb', 'verdict', 'verg', 'verhoev', 'vers', 'versatil', 'vert', 'vet', 'vhs', 'via', 'vib', 'vibr', 'vic', 'vict', 'victim', 'victor', 'vicy', 'vid', 'video', 'vietnam', 'view', 'viewpoint', 'vigil', 'vignet', 'vil', 'villain', 'vint', 'viol', 'vir', 'virgin', 'virt', 'virtu', 'vis', 'viscont', 'visit', 'vit', 'viv', 'vivid', 'voc', 'voic', 'void', 'voight', 'volum', 'volunt', 'vomit', 'von', 'vonnegut', 'vot', 'voy', 'vs', 'vulg', 'vuln', 'wacky', 'wag', 'wagn', 'wait', 'waitress', 'wak', 'wal', 'walk', 'wallac', 'walsh', 'walt', 'wan', 'wand', 'wang', 'wann', 'wannab', 'want', 'war', 'ward', 'wardrob', 'warhol', 'warm', 'warn', 'warp', 'warry', 'was', 'wash', 'washington', 'wast', 'wat', 'watch', 'watson', 'wav', 'wax', 'way', 'wayn', 'weak', 'weakest', 'weal', 'wealthy', 'weapon', 'wear', 'weary', 'weath', 'weav', 'web', 'websit', 'wed', 'wee', 'week', 'weekend', 'weight', 'weird', 'wel', 'welcom', 'well', 'wendigo', 'wendy', 'went', 'werewolf', 'werewolv', 'wes', 'west', 'western', 'wet', 'whack', 'whal', 'what', 'whatev', 'whatsoev', 'wheel', 'wheelchair', 'whenev', 'wherea', 'wheth', 'whilst', 'whin', 'whiny', 'whip', 'whistl', 'whit', 'who', 'whoev', 'whol', 'wholesom', 'whoop', 'whor', 'whos', 'why', 'wick', 'wid', 'widescreen', 'widmark', 'widow', 'wield', 'wif', 'wig', 'wil', 'wild', 'william', 'wilson', 'win', 'winchest', 'wind', 'window', 'wing', 'wint', 'wip', 'wir', 'wis', 'wisdom', 'wish', 'wit', 'witch', 'witchcraft', 'within', 'without', 'witty', 'wiv', 'wizard', 'woe', 'wolf', 'wom', 'wond', 'wonderland', 'wong', 'wont', 'woo', 'wood', 'woody', 'wor', 'word', 'work', 'world', 'worm', 'worn', 'worry', 'wors', 'worst', 'worthless', 'worthwhil', 'worthy', 'would', 'wound', 'wow', 'wrap', 'wreck', 'wrench', 'wrestl', 'wretch', 'wright', 'writ', 'wrong', 'wrot', 'wtf', 'ww', 'wwe', 'wwi', 'www', 'ya', 'yank', 'yard', 'yawn', 'ye', 'yeah', 'year', 'yearn', 'years', 'yel', 'yellow', 'yep', 'yesterday', 'yet', 'yoka', 'york', 'you', 'young', 'youngest', 'youngst', 'youth', 'youtub', 'zan', 'zen', 'zero', 'zizek', 'zomb', 'zomby', 'zon', 'zoom', 'zorro']

预测结果

使用随机森林分类器进行分类

from sklearn.ensemble import RandomForestClassifier
forest = RandomForestClassifier(n_estimators = 100) 
forest = forest.fit(train_data_features, train["sentiment"])

输出提交结果

test = pd.read_csv("./data/testData.tsv", header=0, delimiter="\t",quoting=3 )
num_reviews = len(test["review"])
clean_test_reviews = [] 

for i in range(num_reviews):
    clean_review = review_to_words( test["review"][i] )
    clean_test_reviews.append( clean_review )
test_data_features = vectorizer.transform(clean_test_reviews)
test_data_features = test_data_features.toarray()
result = forest.predict(test_data_features)
output = pd.DataFrame( data={ 
   "id":test["id"], "sentiment":result} )
output.to_csv( "Bag_of_Words_model.csv", index=False, quoting=3 )

尝试使用xgb

from xgboost import XGBClassifier
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(train_data_features, train["sentiment"], test_size=0.2)
model = XGBClassifier()
eval_set = [(X_test, y_test)]
model.fit(X_train, y_train, early_stopping_rounds=10, eval_metric="logloss", eval_set=eval_set, verbose=True)
result = model.predict(test_data_features)
output = pd.DataFrame( data={ 
   "id":test["id"], "sentiment":result} )
output.to_csv( "xgbBag_of_Words_model.csv", index=False, quoting=3 )
[0]	validation_0-logloss:0.676219
Will train until validation_0-logloss hasn't improved in 10 rounds.
[1]	validation_0-logloss:0.662174
[2]	validation_0-logloss:0.651057
[3]	validation_0-logloss:0.640874
[4]	validation_0-logloss:0.632644
[5]	validation_0-logloss:0.625076
[6]	validation_0-logloss:0.617619
[7]	validation_0-logloss:0.611682
[8]	validation_0-logloss:0.605249
[9]	validation_0-logloss:0.599587
[10]	validation_0-logloss:0.594965
[11]	validation_0-logloss:0.589799
[12]	validation_0-logloss:0.585117
[13]	validation_0-logloss:0.580564
[14]	validation_0-logloss:0.576377
[15]	validation_0-logloss:0.572584
[16]	validation_0-logloss:0.568511
[17]	validation_0-logloss:0.565177
[18]	validation_0-logloss:0.561793
[19]	validation_0-logloss:0.558281
[20]	validation_0-logloss:0.55503
[21]	validation_0-logloss:0.552451
[22]	validation_0-logloss:0.549323
[23]	validation_0-logloss:0.546664
[24]	validation_0-logloss:0.544006
[25]	validation_0-logloss:0.54108
[26]	validation_0-logloss:0.538433
[27]	validation_0-logloss:0.535872
[28]	validation_0-logloss:0.533465
[29]	validation_0-logloss:0.5312
[30]	validation_0-logloss:0.528723
[31]	validation_0-logloss:0.526622
[32]	validation_0-logloss:0.524268
[33]	validation_0-logloss:0.522295
[34]	validation_0-logloss:0.519956
[35]	validation_0-logloss:0.518042
[36]	validation_0-logloss:0.515848
[37]	validation_0-logloss:0.514131
[38]	validation_0-logloss:0.512278
[39]	validation_0-logloss:0.510431
[40]	validation_0-logloss:0.508723
[41]	validation_0-logloss:0.506938
[42]	validation_0-logloss:0.505074
[43]	validation_0-logloss:0.50362
[44]	validation_0-logloss:0.501969
[45]	validation_0-logloss:0.500489
[46]	validation_0-logloss:0.499067
[47]	validation_0-logloss:0.497414
[48]	validation_0-logloss:0.496192
[49]	validation_0-logloss:0.494645
[50]	validation_0-logloss:0.493216
[51]	validation_0-logloss:0.49187
[52]	validation_0-logloss:0.490369
[53]	validation_0-logloss:0.489028
[54]	validation_0-logloss:0.487349
[55]	validation_0-logloss:0.486212
[56]	validation_0-logloss:0.485081
[57]	validation_0-logloss:0.483909
[58]	validation_0-logloss:0.482761
[59]	validation_0-logloss:0.481767
[60]	validation_0-logloss:0.480625
[61]	validation_0-logloss:0.479329
[62]	validation_0-logloss:0.478402
[63]	validation_0-logloss:0.477328
[64]	validation_0-logloss:0.476377
[65]	validation_0-logloss:0.475029
[66]	validation_0-logloss:0.473751
[67]	validation_0-logloss:0.472692
[68]	validation_0-logloss:0.471596
[69]	validation_0-logloss:0.470421
[70]	validation_0-logloss:0.469413
[71]	validation_0-logloss:0.468299
[72]	validation_0-logloss:0.467431
[73]	validation_0-logloss:0.466318
[74]	validation_0-logloss:0.465558
[75]	validation_0-logloss:0.464642
[76]	validation_0-logloss:0.463728
[77]	validation_0-logloss:0.462841
[78]	validation_0-logloss:0.46207
[79]	validation_0-logloss:0.461132
[80]	validation_0-logloss:0.460134
[81]	validation_0-logloss:0.45898
[82]	validation_0-logloss:0.458173
[83]	validation_0-logloss:0.457472
[84]	validation_0-logloss:0.456591
[85]	validation_0-logloss:0.456256
[86]	validation_0-logloss:0.455629
[87]	validation_0-logloss:0.454958
[88]	validation_0-logloss:0.454081
[89]	validation_0-logloss:0.453485
[90]	validation_0-logloss:0.452779
[91]	validation_0-logloss:0.452121
[92]	validation_0-logloss:0.45126
[93]	validation_0-logloss:0.450549
[94]	validation_0-logloss:0.450048
[95]	validation_0-logloss:0.44925
[96]	validation_0-logloss:0.448478
[97]	validation_0-logloss:0.447839
[98]	validation_0-logloss:0.447183
[99]	validation_0-logloss:0.446421

还是随机森林好用

在这里插入图片描述

版权声明:本文内容由互联网用户自发贡献,该文观点仅代表作者本人。本站仅提供信息存储空间服务,不拥有所有权,不承担相关法律责任。如发现本站有涉嫌侵权/违法违规的内容, 请联系我们举报,一经查实,本站将立刻删除。

发布者:全栈程序员-站长,转载请注明出处:https://javaforall.net/143707.html原文链接:https://javaforall.net

(0)
全栈程序员-站长的头像全栈程序员-站长


相关推荐

  • 在firefox中关联ed2k到amule

    在firefox中关联ed2k到amule

    2021年4月29日
    120
  • SheetJS中文文档-js导出Excel脚本库[通俗易懂]

    SheetJS中文文档-js导出Excel脚本库[通俗易懂]转载自GITHUB用户rockboom的翻译文档SheetJs下载:GITHUB地址|CSDN下载地址SheetJSjs-xlsxSheetJS是用于多种电子表格格式的解析器和编写器。通过官方规范、相关文档以及测试文件实现简洁的JS方法。SheetJS强调解析和编写的稳健,其跨格式的特点和统一的JS规范兼容,并且ES3/ES5浏览器向后兼容IE6。目前这个是社区版,我们也提供了性能增强的专业版,专业版提供样式和专业支持的附加功能。

    2022年5月6日
    1.6K
  • DOS攻击工具(dos攻击教程)

    DOS攻击工具(dos攻击教程)DoS(DenialOfService)攻击是指故意的攻击网络协议实现的缺陷或直接通过野蛮手段残忍地耗尽被攻击对象的资源,目的是让目标计算机或网络无法提供正常的服务或资源访问,使目标系统服务系统停止响应甚至崩溃然而随着网络上免费的可用DDoS工具增多,DoS攻击也日益增长,下面介绍几款Hacker常用的DoS攻击工具。特别提示:仅用于攻防演练及教学测试用途,禁止非法使用。1、卢瓦(LO…

    2022年4月18日
    558
  • Android开发前景(海洋药物开发前景)

    一、Android的产生过程和发展1.概念:Android是一种基于Linux的自由及开放源代码的操作系统,现在的主要适用范围一般是为移动端设备,如一类安卓手机和平板电脑。最初的安卓系统由Google公司和开放手机联盟领导及开发,2007年11月,Google与84家硬件制造商、软件开发商及电信营运商组建开放手机联盟共同研发改良Android系统。第一部Android智能手机发布于2008年10月,随后安卓系统也由手机平台逐渐像像平板电脑以及其他领域扩展。2011年第一季度,Android在全球的市

    2022年4月14日
    35
  • mysql b+树优点_基础B

    mysql b+树优点_基础B写在前面大家在面试的时候,肯定都会被问到MySql的知识,以下是面试场景:面试官:对于MySQL,你对他索引原理了解吗?我:了解面试官:MySQL的索引是用什么数据机构的?我:B+树面试官:为什么要用B+树,而不是B树?我:…面试官:用B+树作为MySql的索引结构,用什么好处?我:…B树和B+树是MySQL索引使用的数据结构,对于索引优化和原理理解都非常重要,下面我的写文章就是要把B树,B+树的神秘面纱揭开,让大家在面试的时候碰到这个知识点一往无前,不再成为你的知识盲点!欢迎关注公

    2025年6月3日
    2
  • java 上传文件到服务器_ameqp服务器网址

    java 上传文件到服务器_ameqp服务器网址privateStringsaveImageReturnPath(MultipartFilemultiFile)throwsIllegalStateException,IOException{ StringdateName=PicFileUtil.randomFileName()+multiFile.getOriginalFilename(); …

    2025年9月13日
    5

发表回复

您的邮箱地址不会被公开。 必填项已用 * 标注

关注全栈程序员社区公众号