README - tscrape - twitter scraper (not working anymore)

git clone git://git.codemadness.org/tscrape

tscrape
-------

Twitter feed parser.

It parses JSON from stdin and writes it in a TAB-separated format that is
easier to process with various (UNIX) tools. Formatting programs are included
to convert this TAB-separated format to various other formats. Some programs
and scripts are also included to import and export OPML and to fetch, filter,
merge and order items.

The name tscrape comes from its original approach of scraping the HTML of the
Twitter page. It now uses the JSON API.


Build and install
-----------------

	$ make
	# make install


Usage
-----

* Create the configuration file ~/.tscrape/tscraperc, see tscraperc.example.
* Run tscrape_update


Using sfeed to convert the tscrape TSV output to an Atom feed:

	awk 'BEGIN { OFS = FS = "\t"; }
	{
		print $1 OFS $4 OFS "https://twitter.com/" $6 "/status/" $5 \
			OFS "" OFS "" OFS $5 OFS $7 OFS "";
	}' ~/.tscrape/feeds/* | sfeed_atom

sfeed can be found at: https://codemadness.org/git/sfeed/file/README.html


Why
---

Twitter removed the ability to follow users through an RSS feed without
authenticating or using their API. With this program you can format tweets in
any way you like, relatively anonymously.


Dependencies
------------

- C compiler (C99).
- libc (recommended: C99 and POSIX >= 200809).


Optional dependencies
---------------------

- POSIX make(1) for Makefile.
- POSIX sh(1),
  used by tscrape_update(1).
- curl(1) binary: https://curl.haxx.se/ ,
  used by tscrape_update(1), can be replaced with any tool like wget(1),
  OpenBSD ftp(1) or hurl(1): https://git.codemadness.org/hurl/
- mandoc for documentation: https://mdocml.bsd.lv/


License
-------

ISC, see LICENSE file.


Author
------

Hiltjo Posthuma <hiltjo@codemadness.org>
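

The awk command in the Usage section rearranges the tscrape fields before
piping them to sfeed_atom. The sketch below runs the same awk program on a
single hand-written line so the mapping can be inspected without any feed
files. The sample line, the function name tscrape_to_sfeed and the field
meanings in the comments are assumptions: only the positions of the username
($6) and the status id ($5) follow directly from the URL the command builds.
The other meanings are guesses.

```shell
#!/bin/sh
# Hypothetical helper: the awk program from the Usage section, reading stdin.
# Assumed tscrape fields (not taken from the tscrape documentation):
#   $1 timestamp, $4 tweet text, $5 status id, $6 username, $7 display name.
tscrape_to_sfeed() {
	awk 'BEGIN { OFS = FS = "\t"; }
	{
		print $1 OFS $4 OFS "https://twitter.com/" $6 "/status/" $5 \
			OFS "" OFS "" OFS $5 OFS $7 OFS "";
	}'
}

# Made-up sample line; fields 2 and 3 are placeholders ("-") because the
# awk program does not use them.
out=$(printf '1600000000\t-\t-\tHello world\t12345\texampleuser\tExample User\n' |
	tscrape_to_sfeed)

# Show one output field per line to make the empty fields visible.
printf '%s\n' "$out" | tr '\t' '\n'
```

The result is a line of eight TAB-separated fields, three of them empty, and
it can be piped to sfeed_atom in the same way as the output of the original
command.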