python - corenlp.py is throwing a "Could not create the Java Virtual Machine" error


I am trying to implement a dependency parser using the stanford-corenlp-python wrapper.

Since I am on the Windows platform, where the pexpect package gives an error during execution, I spent time installing wexpect for Windows, followed the steps here, and completed the setup. Now when I try to execute corenlp.py again, I get the following error and the program terminates. Please help me with this.

Traceback (most recent call last):
  File "corenlp.py", line 258, in <module>
    nlp = StanfordCoreNLP()
  File "corenlp.py", line 169, in __init__
    self.corenlp.expect("done.", timeout=20) # load pos tagger model (~5sec)
  File "/cygdrive/f/masters-spring2015/natural language processing/project/stanford-corenlp-python/wexpect.py", line 1356, in expect
    return self.expect_list(compiled_pattern_list, timeout, searchwindowsize)
  File "/cygdrive/f/masters-spring2015/natural language processing/project/stanford-corenlp-python/wexpect.py", line 1370, in expect_list
    return self.expect_loop(searcher_re(pattern_list), timeout, searchwindowsize)
  File "/cygdrive/f/masters-spring2015/natural language processing/project/stanford-corenlp-python/wexpect.py", line 1441, in expect_loop
    raise EOF (str(e) + '\n' + str(self))
wexpect.EOF: End Of File (EOF) in read_nonblocking(). Empty string style platform.
<wexpect.spawn_unix object at 0x7fdad40c>
version: 2.3 ($Revision: 399 $)
command: /cygdrive/c/windows/system32/java
args: ['/cygdrive/c/windows/system32/java', '-Xmx1800m', '-cp', './stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1.jar:./stanford-corenlp-full-2014-08-27/stanford-corenlp-3.4.1-models.jar:./stanford-corenlp-full-2014-08-27/joda-time.jar:./stanford-corenlp-full-2014-08-27/xom.jar:./stanford-corenlp-full-2014-08-27/jollyday.jar', 'edu.stanford.nlp.pipeline.StanfordCoreNLP', '-props', 'default.properties']
searcher: searcher_re:
    0: re.compile("done.")
buffer (last 100 chars):
before (last 100 chars): Could not create the Java Virtual Machine. Error: A fatal exception has occurred. Program will exit.
after: <class 'wexpect.EOF'>
match: None
match_index: None
exitstatus: None
flag_eof: True
pid: 7104
child_fd: 3
closed: False
timeout: 30
delimiter: <class 'wexpect.EOF'>
logfile: None
logfile_read: None
logfile_send: None
maxread: 2000
ignorecase: False
searchwindowsize: None
delaybeforesend: 0.05
delayafterclose: 0.1
delayafterterminate: 0.1

The stanford-corenlp-python wrapper starts a server that spawns an instance of the command line interface for CoreNLP, receives sentences over HTTP, pipes those sentences to the spawned CLI instance via stdin, takes the stdout result, parses it, and sends the result back via HTTP, where it is parsed again in the process that sent the request in the first place.

The error you're seeing, that wexpect is getting an unexpected EOF, looks like the spawned instance of the CoreNLP CLI is crashing. That is probably because you're running this in Cygwin. Cygwin looks like Unix, but when it needs to do really Unix-y things, like run other programs, things that require genuine interaction with the operating system, it starts to turn into garbage.
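To make that failure mode concrete, here is a minimal sketch of what happens when a parent waits for a startup marker and the child dies first. It is not the wrapper's code: it uses the standard `subprocess` module instead of wexpect, the helper name `wait_for` is hypothetical, and a small Python child stands in for the `java` process.

```python
import subprocess
import sys


def wait_for(marker, child_cmd):
    """Read the child's output until `marker` appears; raise if it exits first.

    Hypothetical helper: the real wrapper uses pexpect/wexpect's expect().
    """
    proc = subprocess.Popen(
        child_cmd,
        stdout=subprocess.PIPE,
        stderr=subprocess.STDOUT,
        universal_newlines=True,
    )
    seen = []
    for line in proc.stdout:
        seen.append(line.rstrip("\n"))
        if marker in line:
            return proc, seen
    # The loop ended without the marker: the child closed its output (EOF).
    proc.wait()
    raise EOFError(
        "child exited with code %s before printing %r; output was: %r"
        % (proc.returncode, marker, seen)
    )


# Stand-in for a JVM that fails to start: print an error, exit non-zero.
crashing_child = [
    sys.executable,
    "-c",
    "import sys; print('Error: Could not create the Java Virtual Machine.'); sys.exit(1)",
]

try:
    wait_for("done.", crashing_child)
except EOFError as exc:
    print("startup failed:", exc)
```

The parent never sees "done." because the child's last words are the JVM error, which matches the "before" field in the dump above.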

I'm guessing you went with Cygwin because you can't use pexpect on the Windows command line, and the stanford-corenlp-python wrapper uses it. From pexpect's docs, though:

Pexpect only works on POSIX systems, where the pty module is present in the standard library. It may be possible to run it on Windows using Cygwin.

I think what we're seeing here is an instance where pexpect fails on Cygwin.

My recommendation: don't use the stanford-corenlp-python wrapper; it's slow and buggy. After you get this working, something else will go wrong, and after you get it running, it will drag out your processing tremendously.

Instead, run CoreNLP directly from the command line. If you have a batch job to do, use a filelist. The output is XML; you can parse or grep it and be on your way. From the CoreNLP home page:

If you want to process a list of files, use the following command line:

java -cp stanford-corenlp-VV.jar:stanford-corenlp-VV-models.jar:xom.jar:joda-time.jar:jollyday.jar:ejml-VV.jar -Xmx2g edu.stanford.nlp.pipeline.StanfordCoreNLP [ -props <YOUR CONFIGURATION FILE> ] -filelist <A FILE CONTAINING YOUR LIST OF FILES>
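As a rough illustration of the batch approach, the sketch below writes a filelist and assembles that command line from Python. The helper names `write_filelist` and `build_command`, the `*.txt` pattern, and the directory layout are assumptions for the example; only the `-cp`, `-Xmx`, `-props` and `-filelist` arguments come from the command quoted above.

```python
import glob
import os


def write_filelist(doc_dir, filelist_path, pattern="*.txt"):
    """Write one absolute document path per line; return how many were written."""
    paths = sorted(glob.glob(os.path.join(doc_dir, pattern)))
    with open(filelist_path, "w") as fh:
        for path in paths:
            fh.write(os.path.abspath(path) + "\n")
    return len(paths)


def build_command(jars, filelist_path, props=None, heap="2g"):
    """Assemble the CoreNLP batch command as an argument list (no shell quoting)."""
    # os.pathsep is ';' for native Windows Python and ':' on POSIX. A Cygwin
    # Python reports ':', which a Windows java.exe does not accept as a
    # classpath separator, so run this from a native Python on Windows.
    cmd = [
        "java",
        "-cp", os.pathsep.join(jars),
        "-Xmx" + heap,
        "edu.stanford.nlp.pipeline.StanfordCoreNLP",
    ]
    if props:
        cmd += ["-props", props]
    cmd += ["-filelist", filelist_path]
    return cmd
```

With the jar paths collected, for example via `glob.glob(os.path.join(corenlp_dir, "*.jar"))` where `corenlp_dir` is wherever the distribution was unpacked, the result can be handed to `subprocess.call(cmd)`.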

Let me tell you about my own experience using this wrapper. The last time I used the stanford-corenlp-python wrapper, it took 2 weeks to process 17800 documents. That was after a week or two of tweaking the code to handle the large quantity of news articles I was going to pass it.

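Later, I got a new computer and ran CoreNLP through the command line directly, with no wrapper except some Python and shell scripting to create the target file list for batch jobs. It took 15 hours to process 24000 documents, roughly a 30x gain in throughput. The new computer helped, but I anticipated maybe a 4x performance gain from the new hardware. The rest, about 7.5x if the two factors multiply, came from dropping the wrapper.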
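For reference, the throughput comparison works out as follows. This is back-of-the-envelope arithmetic that assumes the two weeks were continuous wall-clock time and that the hardware and wrapper factors multiply; the 4x hardware figure is the estimate given above, not a measurement.

```python
HOURS_PER_WEEK = 7 * 24


def docs_per_hour(documents, hours):
    """Average throughput over a whole run."""
    return float(documents) / hours


wrapper_rate = docs_per_hour(17800, 2 * HOURS_PER_WEEK)  # via the wrapper
cli_rate = docs_per_hour(24000, 15)                      # direct command line

overall_speedup = cli_rate / wrapper_rate
hardware_factor = 4.0  # estimated gain from the new computer (assumption)
remaining_factor = overall_speedup / hardware_factor

print("wrapper: %.0f docs/hour" % wrapper_rate)
print("command line: %.0f docs/hour" % cli_rate)
print("overall speedup: %.1fx" % overall_speedup)
print("left over after hardware: %.1fx" % remaining_factor)
```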

Considering how the stanford-corenlp-python wrapper is written, it's not surprising that it's slow and buggy. Between all the crazy layers of processing it does, I'm amazed they got it working at all. You're better off running CoreNLP on the command line, though, especially since you won't have to run it in Cygwin.

