Created by Mario Alemi on 07/04/2017 in El Estrecho, Putumayo, Peru
conversations: A List of Strings, where each element is a conversation.
tokenizer: The tokenizer applied to each conversation.
priorOccurrences: A Map from each word to its number of occurrences in external corpora (wiki, etc.).
Example of usage:
import scala.io.Source
// Load the prior occurrences
val wordColumn = 1
val occurrenceColumn = 2
val filePath = "/Users/mal/pCloud/Data/word_frequency.tsv"
val priorOccurrences: Map[String, Int] = Source.fromFile(filePath).getLines()
  .map(_.split("\t"))  // split each row once and reuse the columns
  .map(columns => columns(wordColumn).toLowerCase -> columns(occurrenceColumn).toInt)
  .toMap.withDefaultValue(0)  // words missing from the corpus count as 0
// Instantiate the Conversations (`tokenizer` must already be defined in scope)
val rawConversations = Source.fromFile("/Users/mal/pCloud/Scala/manaus/convs.head.csv").getLines().toList
val conversations = new Conversations(rawConversations = rawConversations, tokenizer = tokenizer,
  priorOccurrences = priorOccurrences)
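The loading step above throws on a short row or a non-numeric count. A more tolerant variant might look like the sketch below. It assumes Scala 2.13 or later (for `toIntOption`); the `PriorOccurrences` object and its `parse` method are hypothetical names introduced only for illustration and are not part of the original code.

```scala
// Hypothetical helper, not part of the original code base.
object PriorOccurrences {
  // Builds the word -> occurrences map from tab-separated rows.
  // Rows with too few columns or a non-numeric count are skipped.
  // If a word appears more than once, the last row wins.
  def parse(lines: Iterator[String],
            wordColumn: Int = 1,
            occurrenceColumn: Int = 2): Map[String, Int] =
    lines.flatMap { line =>
      val columns = line.split("\t")
      if (columns.length > math.max(wordColumn, occurrenceColumn))
        columns(occurrenceColumn).trim.toIntOption
          .map(count => columns(wordColumn).toLowerCase -> count)
      else None
    }.toMap.withDefaultValue(0) // unseen words count as 0
}
```

It could be called as `PriorOccurrences.parse(Source.fromFile(filePath).getLines())`, and the result passed as `priorOccurrences` exactly as in the example above.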