[B! apache-spark][functional-comparison] nabinno's bookmarks

nabinno id:nabinno

nabinno's bookmarks tagged apache-spark and functional-comparison (3)

Spark vs. Hadoop MapReduce: Which big data framework to choose
nabinno 2019/12/27
"Linear processing of huge datasets is the advantage of Hadoop MapReduce, while Spark delivers fast performance, iterative processing, real-time analytics, graph processing, machine learning and more"

sciencesoft

alex-bekker

apache-spark

apache-hadoop

mapreduce

functional-comparison
What is the difference between Apache Hive and Apache Spark?
nabinno 2019/12/27
quora

apache-hive

apache-spark

distributed-computing

functional-comparison
What is the difference between HashingTF and CountVectorizer in Spark?
Trying to do doc classification in Spark. I am not sure what the hashing does in HashingTF; does it sacrifice any accuracy? I doubt it, but I don't know. The spark doc says it uses the "hashing trick"... just another example of really bad/confusing naming used by engineers (I'm guilty as well). CountVectorizer also requires setting the vocabulary size, but it has another parameter, a threshold par
nabinno 2019/12/18
stack-overflow

apache-spark

hashingtf

countvectorizer

feature-engineering

pyspark.mllib.feature

functional-comparison