20 Commits

Author SHA1 Message Date
VC
2cb732efed Merge branch 'refresh_main' into 'main'
chore: update megalodon-rs to 0.11.7

See merge request veretcle/oolatoocs!11
2023-12-22 08:09:55 +00:00
VC
5d685b5748 chore: update megalodon-rs to 0.11.7 2023-12-22 09:06:00 +01:00
VC
66664ff621 Merge branch 'feat_better_split' into 'main'
Feat better split

See merge request veretcle/oolatoocs!10
2023-11-29 12:36:00 +00:00
VC
fd84730bdc feat: better split for twitter_count 2023-11-29 13:32:04 +01:00
VC
692f4ff040 chore: bump version 2023-11-29 13:31:45 +01:00
VC
3397416a93 Merge branch 'fix_twitter_count' into 'main'
Fix twitter count

See merge request veretcle/oolatoocs!9
2023-11-29 10:38:27 +00:00
VC
f782987991 chore: bump dependencies’ version 2023-11-29 11:28:10 +01:00
VC
26788f9d37 fix: properly count URL when preceeded by '\n' 2023-11-29 11:25:27 +01:00
VC
ca9b388a50 chore: bump version 2023-11-29 11:24:38 +01:00
VC
42958e0a92 Merge branch 'fix_u16' into 'main'
fix: use u16 instead of i64

See merge request veretcle/oolatoocs!8
2023-11-22 07:57:03 +00:00
VC
77be17e7bf fix: use u16 instead of i64 2023-11-22 08:53:17 +01:00
VC
bd9fd27fd1 Merge branch '5-feat-repeat-mastodon-poll-in-twitter' into 'main'
feat: add poll from Mastodon to Twitter + pass owned values in post_tweet

Closes #5

See merge request veretcle/oolatoocs!7
2023-11-21 22:20:49 +00:00
VC
3e6cae6136 feat: add poll from Mastodon to Twitter + pass owned values in post_tweet 2023-11-21 23:13:40 +01:00
VC
f10baa3eb2 Merge branch '1-find-a-way-to-not-carry-a-toot' into 'main'
Find a way to not carry a toot

Closes #1

See merge request veretcle/oolatoocs!6
2023-11-21 13:03:01 +00:00
VC
c113c1472a doc: add README 2023-11-21 13:59:14 +01:00
VC
cdf7dc70c1 feat: add #NoTweet to skip toot from being tweeted 2023-11-21 13:27:37 +01:00
VC
b1aed34f3c Merge branch '3-cut-toot-in-half-when-they-re-too-big' into 'main'
Cut toot in half

Closes #3

See merge request veretcle/oolatoocs!5
2023-11-20 14:53:19 +00:00
VC
e8bde4c779 feat: move media generation list to twitter.rs to avoid clutter 2023-11-20 15:32:02 +01:00
VC
80946ac131 chore: cargo update 2023-11-20 15:32:02 +01:00
VC
87b0567b59 feat: split toot into 2 tweets when necessary 2023-11-20 15:32:02 +01:00
6 changed files with 531 additions and 278 deletions

474
Cargo.lock generated

File diff suppressed because it is too large


@@ -1,11 +1,12 @@
[package] [package]
name = "oolatoocs" name = "oolatoocs"
version = "1.2.0" version = "1.5.4"
edition = "2021" edition = "2021"
# See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html # See more keys and their definitions at https://doc.rust-lang.org/cargo/reference/manifest.html
[dependencies] [dependencies]
chrono = "0.4.31"
clap = "^4" clap = "^4"
env_logger = "^0.10" env_logger = "^0.10"
futures = "^0.3" futures = "^0.3"

77
README.md Normal file

@@ -0,0 +1,77 @@
# oolatoocs, a Mastodon to Twitter bot
So what is it? Originally, I wrote, with some help, [Scootaloo](https://framagit.org/veretcle/scootaloo/), a Twitter to Mastodon bot, to help the [writers at NintendojoFR](https://www.nintendojo.fr) not worry about Mastodon: the vast majority of writers were posting to Twitter, the bot scooped everything up and arranged it properly for Mastodon, and everything was fine and dandy. It was also used, in an altered, beefed-up version, for [Nupes.social](https://nupes.social) to make the tweets from the NUPES political alliance on Twitter more easily accessible on Mastodon.
But then Elon came, and we couldn't read data from Twitter anymore. So we had to rely on copy/pasting things from one to the other, which is neither fun nor efficient.
Hence `oolatoocs`, which takes a Mastodon Timeline and reposts it to Twitter as properly as possible.
# Remarkable features
What it can do:
* Reproduces the Toot content into the Tweet;
* Cuts (poorly) the Toot in half if it's too long for Twitter and threads it (the cut uses a word count, which is not the best method, but it gets the job done);
* Reuploads images/gifs/videos from Mastodon to Twitter;
* Can reproduce threads from Mastodon to Twitter;
* Can reproduce polls from Mastodon to Twitter;
* Can prevent a Toot from being tweeted by using the #NoTweet (case-insensitive) hashtag in Mastodon.
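The "cut in half" feature can be sketched in plain Rust as follows. This is a minimal, standard-library-only illustration modelled on the `twitter_count` and `generate_multi_tweets` functions added to `utils.rs` in this diff, not a replacement for them. The name `split_in_half` is hypothetical, and the rule that every URL weighs 23 characters is taken from the diff itself.

```rust
/// Approximate the length Twitter sees: each URL counts as 23 characters
/// (the weight used in this diff), every other word counts one per `char`.
fn twitter_count(content: &str) -> usize {
    let words: Vec<&str> = content.split(|c: char| c == ' ' || c == '\n').collect();
    // One separator (space or newline) sits between each pair of adjacent words.
    let separators = words.len().saturating_sub(1);
    let weighted: usize = words
        .iter()
        .map(|w| {
            if w.starts_with("http://") || w.starts_with("https://") {
                23
            } else {
                w.chars().count()
            }
        })
        .sum();
    separators + weighted
}

/// Hypothetical helper: returns `None` when the content fits in one tweet,
/// otherwise splits it on spaces at the middle word.
fn split_in_half(content: &str) -> Option<(String, String)> {
    if twitter_count(content) <= 280 {
        return None;
    }
    let words: Vec<&str> = content.split(' ').collect();
    let (first, second) = words.split_at(words.len() / 2);
    Some((first.join(" "), second.join(" ")))
}
```

Because the cut is made at the middle word rather than at a character budget, the two halves can be unequal in length, and a very long toot may still produce a half that exceeds the limit. That is the "poorly" acknowledged in the list above.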
# Configuration file
The configuration is relatively easy to follow:
```toml
[oolatoocs]
db_path = "/var/lib/oolatoocs/db.sqlite3" # the path to the DB where toot/tweet are stored
[mastodon] # This part can be generated, see below
base = "https://m.nintendojo.fr"
client_id = "<REDACTED>"
client_secret = "<REDACTED>"
redirect = "urn:ietf:wg:oauth:2.0:oob"
token = "<REDACTED>"
[twitter] # you'll have to get this part from Twitter, this can be done via https://developer.twitter.com/en
consumer_key = "<REDACTED>"
consumer_secret = "<REDACTED>"
oauth_token = "<REDACTED>"
oauth_token_secret = "<REDACTED>"
```
## How to generate the Mastodon keys?
Just run:
```bash
oolatoocs register --host https://<your-instance>
```
And follow the instructions.
## How to generate the Twitter part?
You'll need to generate a key. This is a real pain in the ass, but you can use [this script](https://github.com/twitterdev/Twitter-API-v2-sample-code/blob/main/Manage-Tweets/create_tweet.py), modify it and run it to recover your key.
Will I some day make a subcommand to generate it? Maybe…
# How to run
First of all, the `--help`:
```bash
A Mastodon to Twitter Bot
Usage: oolatoocs [OPTIONS] [COMMAND]
Commands:
init Command to init the DB
register Command to register to Mastodon Instance
help Print this message or the help of the given subcommand(s)
Options:
-c, --config <CONFIG_FILE> TOML config file for oolatoocs [default: /usr/local/etc/oolatoocs.toml]
-h, --help Print help
-V, --version Print version
```
Ideally, you'll put it in a cron job (run as a non-root user), with the default path for the config file, and let it do its job. Yeah, that's it.


@@ -14,17 +14,13 @@ use mastodon::get_mastodon_timeline_since;
pub use mastodon::register; pub use mastodon::register;
mod utils; mod utils;
use utils::strip_everything; use utils::{generate_multi_tweets, strip_everything};
mod twitter; mod twitter;
#[allow(unused_imports)] #[allow(unused_imports)]
use twitter::{post_tweet, upload_chunk_media, upload_simple_media}; use twitter::{generate_media_ids, post_tweet, transform_poll};
use futures::{stream, StreamExt};
use log::{error, warn};
use megalodon::entities::attachment::AttachmentType;
use rusqlite::Connection; use rusqlite::Connection;
use std::error::Error;
#[tokio::main] #[tokio::main]
pub async fn run(config: &Config) { pub async fn run(config: &Config) {
@@ -40,60 +36,41 @@ pub async fn run(config: &Config) {
.unwrap_or_else(|e| panic!("Cannot get instance: {}", e)); .unwrap_or_else(|e| panic!("Cannot get instance: {}", e));
for toot in timeline { for toot in timeline {
let Ok(tweet_content) = strip_everything(&toot.content, &toot.tags) else { // detecting tag #NoTweet and skipping the toot
if toot.tags.iter().any(|f| &f.name == "notweet") {
continue;
}
// form tweet_content and strip everything useless in it
let Ok(mut tweet_content) = strip_everything(&toot.content, &toot.tags) else {
continue; // skip in case we can't strip something continue; // skip in case we can't strip something
}; };
let mut medias: Vec<u64> = vec![];
// if we wanted to cut toot in half, now would be the right time to do so
let media_attachments = toot.media_attachments.clone();
let mut stream = stream::iter(media_attachments)
.map(|media| {
let twitter_config = config.twitter.clone();
tokio::task::spawn(async move {
match media.r#type {
AttachmentType::Image => {
upload_simple_media(&twitter_config, &media.url, &media.description)
.await
}
AttachmentType::Gifv => {
upload_chunk_media(&twitter_config, &media.url, "tweet_gif").await
}
AttachmentType::Video => {
upload_chunk_media(&twitter_config, &media.url, "tweet_video").await
}
_ => Err::<u64, Box<dyn Error + Send + Sync>>(
OolatoocsError::new(&format!(
"Cannot treat this type of media: {}",
&media.url
))
.into(),
),
}
})
})
.buffered(4);
while let Some(result) = stream.next().await {
match result {
Ok(Ok(v)) => medias.push(v),
Ok(Err(e)) => warn!("Cannot treat media: {}", e),
Err(e) => error!("Something went wrong when joining the main thread: {}", e),
}
}
// threads if necessary // threads if necessary
let reply_to = toot.in_reply_to_id.and_then(|t| { let mut reply_to = toot.in_reply_to_id.and_then(|t| {
read_state(&conn, Some(t.parse::<u64>().unwrap())) read_state(&conn, Some(t.parse::<u64>().unwrap()))
.ok() .ok()
.flatten() .flatten()
.map(|s| s.tweet_id) .map(|s| s.tweet_id)
}); });
// if the toot is too long, we cut it in half here
if let Some((first_half, second_half)) = generate_multi_tweets(&tweet_content) {
tweet_content = second_half;
let reply_id = post_tweet(&config.twitter, first_half, vec![], reply_to, None)
.await
.unwrap_or_else(|e| panic!("Cannot post the first half of {}: {}", &toot.id, e));
reply_to = Some(reply_id);
};
// treats poll if any
let in_poll = toot.poll.map(|p| transform_poll(&p));
// treats medias
let medias = generate_media_ids(&config.twitter, &toot.media_attachments).await;
// posts corresponding tweet // posts corresponding tweet
let tweet_id = post_tweet(&config.twitter, &tweet_content, &medias, &reply_to) let tweet_id = post_tweet(&config.twitter, tweet_content, medias, reply_to, in_poll)
.await .await
.unwrap_or_else(|e| panic!("Cannot Tweet {}: {}", toot.id, e)); .unwrap_or_else(|e| panic!("Cannot Tweet {}: {}", toot.id, e));


@@ -1,6 +1,12 @@
use crate::config::TwitterConfig; use crate::config::TwitterConfig;
use crate::error::OolatoocsError; use crate::error::OolatoocsError;
use log::debug; use chrono::Utc;
use futures::{stream, StreamExt};
use log::{debug, error, warn};
use megalodon::entities::{
attachment::{Attachment, AttachmentType},
Poll,
};
use oauth1_request::Token; use oauth1_request::Token;
use reqwest::{ use reqwest::{
multipart::{Form, Part}, multipart::{Form, Part},
@@ -26,6 +32,8 @@ struct Tweet {
media: Option<TweetMediasIds>, media: Option<TweetMediasIds>,
#[serde(skip_serializing_if = "Option::is_none")] #[serde(skip_serializing_if = "Option::is_none")]
reply: Option<TweetReply>, reply: Option<TweetReply>,
#[serde(skip_serializing_if = "Option::is_none")]
poll: Option<TweetPoll>,
} }
#[derive(Serialize, Debug)] #[derive(Serialize, Debug)]
@@ -38,6 +46,12 @@ struct TweetReply {
in_reply_to_tweet_id: String, in_reply_to_tweet_id: String,
} }
#[derive(Serialize, Debug)]
pub struct TweetPoll {
pub options: Vec<String>,
pub duration_minutes: u16,
}
#[derive(Deserialize, Debug)] #[derive(Deserialize, Debug)]
struct TweetResponse { struct TweetResponse {
data: TweetResponseData, data: TweetResponseData,
@@ -99,8 +113,49 @@ fn get_token(config: &TwitterConfig) -> Token {
) )
} }
pub async fn generate_media_ids(config: &TwitterConfig, media_attach: &[Attachment]) -> Vec<u64> {
let mut medias: Vec<u64> = vec![];
let media_attachments = media_attach.to_owned();
let mut stream = stream::iter(media_attachments)
.map(|media| {
let twitter_config = config.clone();
tokio::task::spawn(async move {
match media.r#type {
AttachmentType::Image => {
upload_simple_media(&twitter_config, &media.url, &media.description).await
}
AttachmentType::Gifv => {
upload_chunk_media(&twitter_config, &media.url, "tweet_gif").await
}
AttachmentType::Video => {
upload_chunk_media(&twitter_config, &media.url, "tweet_video").await
}
_ => Err::<u64, Box<dyn Error + Send + Sync>>(
OolatoocsError::new(&format!(
"Cannot treat this type of media: {}",
&media.url
))
.into(),
),
}
})
})
.buffered(4);
while let Some(result) = stream.next().await {
match result {
Ok(Ok(v)) => medias.push(v),
Ok(Err(e)) => warn!("Cannot treat media: {}", e),
Err(e) => error!("Something went wrong when joining the main thread: {}", e),
}
}
medias
}
/// This function uploads simple images from Mastodon to Twitter and returns the media id from Twitter /// This function uploads simple images from Mastodon to Twitter and returns the media id from Twitter
pub async fn upload_simple_media( async fn upload_simple_media(
config: &TwitterConfig, config: &TwitterConfig,
u: &str, u: &str,
d: &Option<String>, d: &Option<String>,
@@ -191,7 +246,7 @@ async fn metadata_create(
} }
/// This posts video/gif to Twitter and returns the media id from Twitter /// This posts video/gif to Twitter and returns the media id from Twitter
pub async fn upload_chunk_media( async fn upload_chunk_media(
config: &TwitterConfig, config: &TwitterConfig,
u: &str, u: &str,
t: &str, t: &str,
@@ -373,24 +428,38 @@ pub async fn upload_chunk_media(
Ok(orig_media_id.media_id) Ok(orig_media_id.media_id)
} }
pub fn transform_poll(p: &Poll) -> TweetPoll {
let poll_end_datetime = p.expires_at.unwrap(); // should be safe at this point
let now = Utc::now();
let diff = poll_end_datetime.signed_duration_since(now);
TweetPoll {
options: p.options.iter().map(|i| i.title.clone()).collect(),
duration_minutes: diff.num_minutes().try_into().unwrap(), // safe here, number is positive
// and can't be over 21600
}
}
/// This posts Tweets with all the associated medias /// This posts Tweets with all the associated medias
pub async fn post_tweet( pub async fn post_tweet(
config: &TwitterConfig, config: &TwitterConfig,
content: &str, content: String,
medias: &[u64], medias: Vec<u64>,
reply_to: &Option<u64>, reply_to: Option<u64>,
poll: Option<TweetPoll>,
) -> Result<u64, Box<dyn Error>> { ) -> Result<u64, Box<dyn Error>> {
let empty_request = EmptyRequest {}; // Why? Because fuck you, thats why! let empty_request = EmptyRequest {}; // Why? Because fuck you, thats why!
let token = get_token(config); let token = get_token(config);
let tweet = Tweet { let tweet = Tweet {
text: content.to_string(), text: content,
media: medias.is_empty().not().then(|| TweetMediasIds { media: medias.is_empty().not().then(|| TweetMediasIds {
media_ids: medias.iter().map(|m| m.to_string()).collect(), media_ids: medias.iter().map(|m| m.to_string()).collect(),
}), }),
reply: reply_to.map(|s| TweetReply { reply: reply_to.map(|s| TweetReply {
in_reply_to_tweet_id: s.to_string(), in_reply_to_tweet_id: s.to_string(),
}), }),
poll,
}; };
let client = Client::new(); let client = Client::new();


@@ -3,6 +3,50 @@ use megalodon::entities::status::Tag;
use regex::Regex; use regex::Regex;
use std::error::Error; use std::error::Error;
/// Generate 2 contents out of 1 if that content is > 280 chars, None else
pub fn generate_multi_tweets(content: &str) -> Option<(String, String)> {
// Twitter webforms are utf-8 encoded, so we cannot count on len(), we don't need
// encode_utf16().count()
if twitter_count(content) <= 280 {
return None;
}
let split_content = content.split(' ');
let split_count = split_content.clone().count();
let first_half: String = split_content
.clone()
.take(split_count / 2)
.collect::<Vec<_>>()
.join(" ");
let second_half: String = split_content
.clone()
.skip(split_count / 2)
.collect::<Vec<_>>()
.join(" ");
Some((first_half, second_half))
}
/// Twitter doesn't count words the same way we do, so you'll have to improvise
fn twitter_count(content: &str) -> usize {
let mut count = 0;
let split_content = content.split(&[' ', '\n']);
count += split_content.clone().count() - 1; // count the spaces
for word in split_content {
if word.starts_with("http://") || word.starts_with("https://") {
count += 23;
} else {
count += word.chars().count();
}
}
count
}
pub fn strip_everything(content: &str, tags: &Vec<Tag>) -> Result<String, Box<dyn Error>> { pub fn strip_everything(content: &str, tags: &Vec<Tag>) -> Result<String, Box<dyn Error>> {
let mut res = strip_html_tags(&content.replace("</p><p>", "\n\n").replace("<br />", "\n")); let mut res = strip_html_tags(&content.replace("</p><p>", "\n\n").replace("<br />", "\n"));
@@ -48,6 +92,59 @@ fn strip_html_tags(input: &str) -> String {
mod tests { mod tests {
use super::*; use super::*;
#[test]
fn test_twitter_count() {
let content = "tamerelol?! 🐵";
assert_eq!(twitter_count(content), content.chars().count());
let content = "Shoot out to https://y.ml/ !";
assert_eq!(twitter_count(content), 38);
let content = "this is the link https://www.google.com/tamerelol/youpi/tonperemdr/tarace.html if you like! What if I shit a final";
assert_eq!(twitter_count(content), 76);
let content = "multi ple space";
assert_eq!(twitter_count(content), content.chars().count());
let content = "This link is LEEEEET\n\nhttps://www.factornews.com/actualites/ca-sent-le-sapin-pour-free-radical-design-49985.html";
assert_eq!(twitter_count(content), 45);
}
#[test]
fn test_generate_multi_tweets_to_none() {
// test «standard» text
let tweet_content =
"LOLOLOL, je suis bien trop petit pour être coupé en deux voyons :troll:".to_string();
let youpi = generate_multi_tweets(&tweet_content);
assert_eq!(None, youpi);
// test with «complex» emoji (2 utf-8 chars)
let tweet_content = "🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷🇫🇷".to_string();
let youpi = generate_multi_tweets(&tweet_content);
assert_eq!(None, youpi);
}
#[test]
fn test_generate_multi_tweets_to_some() {
let tweet_content = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ipsum dolor sit amet consectetur adipiscing elit pellentesque. Pharetra pharetra massa massa ultricies mi quis hendrerit dolor. Mauris nunc congue nisi vitae. Scelerisque varius morbi enim nunc faucibus a pellentesque sit amet. Morbi leo urna molestie at elementum. Tristique et egestas quis ipsum suspendisse ultrices gravida dictum fusce. Amet porttitor eget dolor morbi.".to_string();
let youpi = generate_multi_tweets(&tweet_content);
let first_half = "Lorem ipsum dolor sit amet, consectetur adipiscing elit, sed do eiusmod tempor incididunt ut labore et dolore magna aliqua. Ipsum dolor sit amet consectetur adipiscing elit pellentesque. Pharetra pharetra massa massa ultricies mi quis hendrerit dolor.".to_string();
let second_half = "Mauris nunc congue nisi vitae. Scelerisque varius morbi enim nunc faucibus a pellentesque sit amet. Morbi leo urna molestie at elementum. Tristique et egestas quis ipsum suspendisse ultrices gravida dictum fusce. Amet porttitor eget dolor morbi.".to_string();
assert_eq!(youpi, Some((first_half, second_half)));
}
#[test] #[test]
fn test_strip_mastodon_tags() { fn test_strip_mastodon_tags() {
let tags = vec![ let tags = vec![