R-universe - kasperwelbers (Kasper Welbers)

corpustools: Managing, Querying and Analyzing Tokenized Text (1 year ago)

Introduction | Creating a tcorpus | Creating a tcorpus from full-text | Additional options | Importing a tokenlist | Managing a tCorpus | Adding, removing and mutating columns | Subsetting a tCorpus | Deduplication | Preprocessing | Basic preprocessing | Advanced preprocessing with UDPipe | Create_tcorpus keeps a persistent cache | Using multiple cores | Filtering tokens | Creating a DTM or DFM | Why keep the full corpus intact? | Querying the tcorpus | search_features() | Counting hits and plotting | Associations | Inspect results in full text | Adding query hits as token features | search_contexts() | Subset by search_contexts() | search_dictionary | Text analysis techniques | Semantic networks based on co-occurrence | Corpus comparisons | Feature associations | Using the tcorpus R6 methods | Being careful with shallow copies | Copying a tCorpus

corpustools 0.5.2 by Kasper Welbers and Wouter van Atteveldt | corpustools.Rmd

RNewsflow: Tools for analyzing content homogeneity and news diffusion using computational text analysis (2 years ago)

Abstract | Introduction | Preparing the data | Pre-processing texts and creating the DTM | Using word statistics to filter and weight the DTM | Calculating document similarities | Tailoring the document comparison window | Analyzing the document similarity network | Aggregating the document similarity network | Inspecting and visualizing results | Alternative applications of this package | Conclusion and future improvements | Practical code example | References

RNewsflow 1.2.8 by Kasper Welbers and Wouter van Atteveldt | RNewsflow.Rmd