csv - Column separation in Python


I am working on my bachelor thesis and am using Python to analyze data. Unfortunately, I am not a programming expert, nor do I know much about working with Python.

I have code that separates the columns in CSV files by a comma. I want the code to separate the columns by | instead.

I have tried to replace the comma in line 58 with |, but that did not work, surprise surprise. Because I am such a noob in the programming field, my Google search did not make sense to me at all. Any help would be largely appreciated!

    from sklearn.feature_extraction.text import CountVectorizer
    from sklearn import linear_model
    import csv
    import cPickle
    from sklearn.metrics import accuracy_score

    def main():
        train_file = "train.csv"
        test_file  = "test.csv"
        # read documents
        train_docs, y = read_docs(train_file)

        # define features to extract (character bigrams in this case)
        extract = CountVectorizer(lowercase=False, ngram_range=(2,2),
                                  analyzer="char")

        extract.fit(train_docs) # create vocabulary from training data

        # extract features from train data
        x = extract.transform(train_docs)

        # initialize model
        model = linear_model.LogisticRegression()

        # train model
        model.fit(x, y)

        # write model to file so it can be reused
        cPickle.dump((extract,model),open("model.pickle","w"))

        # print coefficients to see which features are important
        for i,f in enumerate(extract.get_feature_names()):
            print f, model.coef_[0][i]

        # testing
        # read test data
        test_docs, y_test = read_docs(test_file)

        # extract features from test data
        x_test = extract.transform(test_docs)

        # apply model to test data
        y_predict = model.predict(x_test)

        # evaluation
        print accuracy_score(y_test, y_predict)

    def read_docs(filename):
        '''
        Return x,y where x is the list of documents and y is the list of
        labels.
        '''
        x = []
        y = []
        with open(filename) as f:
            r = csv.reader(f)
            for row in r:
                text,label = row
                x.append(text)
                y.append(int(label))
        return x,y

    main()

At the moment I have got as far as this:

        csv.register_dialect('pipes', delimiter='|')

        with open(filename) as f:
            r = csv.reader(f, dialect='pipes')
            for row in r:
                text,label = row
                x.append(text)
                y.append(int(label))
        return x,y

But I keep getting this error now:

    Traceback (most recent call last):
      File "d:/python/logreggwen.py", line 67, in <module>
        main()
      File "d:/python/logreggwen.py", line 11, in main
        train_docs, y = read_docs(train_file)
      File "d:/python/logreggwen.py", line 61, in read_docs
        text,label = row
    ValueError: need more than 1 value to unpack

You need to tell the csv reader which delimiter your data file uses:

csv.reader(f, delimiter='|') 
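As a minimal sketch of that call, the snippet below parses a small in-memory sample instead of a real file. The sample text and the names `sample` and `rows` are made up for illustration, and the snippet is written for Python 3 (the `csv.reader` call itself takes the same `delimiter` argument in Python 2).

```python
import csv
import io

# Hypothetical sample data: one document and one label per line, separated by "|".
sample = io.StringIO("hello, world|1\ngoodbye|0\n")

# delimiter="|" makes the reader split on pipes; commas stay part of the text.
rows = list(csv.reader(sample, delimiter="|"))

for text, label in rows:
    print(text, int(label))
```

Each row comes back as a list of two strings, so the `text,label = row` unpacking from the question works as long as every line really contains exactly one |.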

But actually, you need to read the corresponding documentation:

https://docs.python.org/2/library/csv.html#examples
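The ValueError in the traceback means that at least one row came back with a single field, which most likely happens when a line contains no | at all (for example a file that is still comma-separated, or a header line). Below is a minimal sketch of a `read_docs` variant that reports such a line instead of failing on the unpacking. It assumes Python 3; the `delimiter` parameter and the wording of the error message are additions for illustration, not part of the original code.

```python
import csv

def read_docs(filename, delimiter="|"):
    """Return (x, y): x is the list of documents, y the list of integer labels."""
    x, y = [], []
    # newline="" is what the csv documentation recommends for files given to csv.reader.
    with open(filename, newline="") as f:
        reader = csv.reader(f, delimiter=delimiter)
        for row in reader:
            if not row:
                continue  # skip completely empty lines
            if len(row) != 2:
                # Point at the offending line rather than failing on unpacking.
                raise ValueError(
                    "line %d: expected 2 fields, got %d: %r"
                    % (reader.line_num, len(row), row)
                )
            text, label = row
            x.append(text)
            y.append(int(label))
    return x, y
```

If this raises, the reported line shows whether the file really uses | everywhere or whether some rows still use a different separator.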

