hadoop - UIMA DUCC vs UIMA on Hadoop
I am trying to design a distributed, scalable pipeline based on UIMA. How should I decide between using UIMA DUCC and UIMA on Hadoop? What would I be missing out on if I build on UIMA DUCC rather than Hadoop, or vice versa?
One dimension is application characteristics. Hadoop would have a big advantage for I/O-intensive applications. DUCC should have a big advantage for large-memory applications that need to run multiple pipeline copies in different threads to achieve high CPU utilization.
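To make the "multiple pipeline copies in different threads" point concrete, here is a minimal sketch of the pattern: one large read-only resource is loaded once per process, and several lightweight pipeline copies share it from separate threads. It is not DUCC or UIMA code; `SHARED_LEXICON`, `Pipeline`, `annotate`, and `run` are hypothetical names made up for illustration. UIMA itself is Java, where threads run CPU-bound work in parallel; in CPython they do not, so this shows only the memory-sharing structure, not the CPU gain.

```python
from concurrent.futures import ThreadPoolExecutor
import threading

# Hypothetical stand-in for a large read-only resource (a dictionary, a model)
# that would be expensive to load once per pipeline copy.
SHARED_LEXICON = {"hadoop": "PLATFORM", "ducc": "PLATFORM", "uima": "FRAMEWORK"}


class Pipeline:
    """Hypothetical pipeline copy: keeps its own state, reads the shared resource."""

    def __init__(self, lexicon):
        self.lexicon = lexicon  # a reference to the shared object, not a copy
        self.processed = 0      # per-copy state, touched only by the owning thread

    def process(self, text):
        self.processed += 1
        tokens = text.lower().split()
        return [(tok, self.lexicon[tok]) for tok in tokens if tok in self.lexicon]


_local = threading.local()


def annotate(text):
    # Lazily create one pipeline copy per worker thread.
    if not hasattr(_local, "pipeline"):
        _local.pipeline = Pipeline(SHARED_LEXICON)
    return _local.pipeline.process(text)


def run(documents, copies=4):
    # `copies` is the number of pipeline copies (threads) in this one process.
    with ThreadPoolExecutor(max_workers=copies) as pool:
        return list(pool.map(annotate, documents))  # map preserves input order
```

Calling `run(["UIMA on Hadoop", "UIMA DUCC"], copies=2)` annotates both documents while holding the lexicon in memory only once. Running the same work as separate single-threaded processes would hold one copy of the resource per process, which is why a large-memory analytic favors the threaded layout.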
Another dimension is taking advantage of UIMA vs. taking advantage of Hadoop. DUCC builds on base UIMA capabilities, providing many scale-out options, built-in performance metrics, and debugging support, all based on core UIMA components. The more complex the UIMA pipeline, the bigger the advantage for DUCC; for example, complex processing flows can be implemented directly in DUCC but would have to be transformed into map-reduce.
For those with sufficient Hadoop expertise, a relatively simple UIMA analytic can be integrated into an existing Hadoop shop without having to learn UIMA.