Update README.md

2023-09-17 16:45:02 -03:00 · 2023-09-17 16:45:02 -03:00 · ff2223d7e8
parent 46b3aec315
commit ff2223d7e8
1 changed file with 2 additions and 2 deletions
--- a/README.md
+++ b/README.md
@@ -17,8 +17,8 @@ Consider doing your own data analysis. If you save your CSV, and make a pull req
 **Limitations**
 - 4chan will recycle poster IDs, so they are not unique identifiers. Therefore, the data mining on n-pbtid isn't fully accurate for the upper-bound, but it should be closer to representative at the lower-bound.
 - Time of day scrapes based on time-zone of interest (e.g. New York posting hours) hasn't been implemented, but this can be easily solved by scraping the threads from 0900h - 1700h by intervals of 3 hours to allow old threads to die, for your target local time.
- Some synonyms will not be counted under a single word (e.g. glowie, glow, glows, glower), so manual input will need to be implemented in the v2 script. The same idea goes for plural words, ~~but there is another package that will take root words, and remove word modifiers (e.g. -ing, -ed, -s, etc)~~
- ~~n-gram analysis has not been completed, but a simple edit to the v2 script will accomodate this. n-gram means the number of paired words (e.g. 2gram = he is, she is, it is; 3gram = i am going, she is going, he is going; etc). This can provide more context of rhetoric within the entire board scrape.~~
+- Some synonyms will not be counted under a single word (e.g. glowie, glow, glows, glower), so manual input will need to be implemented in the v2 script. The same idea goes for plural words, **The package is trash**~~but there is another package that will take root words, and remove word modifiers (e.g. -ing, -ed, -s, etc)~~
+- **It has been completed** ~~n-gram analysis has not been completed, but a simple edit to the v2 script will accomodate this. n-gram means the number of paired words (e.g. 2gram = he is, she is, it is; 3gram = i am going, she is going, he is going; etc). This can provide more context of rhetoric within the entire board scrape.~~

 # html_text vs html_text2 from rvest