Please and Thank You

Comments


lihea · 9 years ago · FIRST

Amen


smbadat · 9 years ago

This post is so great, I'm gonna post it tomorrow

lihea · 9 years ago

Smbadat, NOOOOOOOOOOOOOOOOOOO!

smbadat · 9 years ago

No = Yes
-Rapists everywhere


deleted · 9 years ago

I told them to implement a system to prevent reposts. I even gave them some code. But "they have a better system".
As you can see, they do not. But oh well.

Zeus · 9 years ago

The solution you recommended is a more accurate algorithm, but it's also much slower. The fact that it doesn't support indexing already makes it impractical. The second problem is the nature of our content.
1. Content is often presented differently, e.g. the same text can appear in a tweet, a Tumblr post, or on a photo.
2. Details of the content matter. Take for example a Tumblr post or a generated meme. The difference is either too large or too small to be considered a repost. That means either lots of false positives or lots of false negatives. Of course this is still more accurate than the implemented algorithm, but not enough to justify the drawbacks.
3. We can no longer filter or index images, and the running time of the algorithm is O(n). Take for example 300 posts a day * 7 days = 2100 posts. Even at a (very) optimistic 0.1 seconds per comparison, that's 210 seconds for each user upload.
So yes, we do have a better system. We can't afford to spend months or years developing perfect repost detection.
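
For anyone who wants to check the arithmetic in point 3, here is a minimal sketch in Python. The function name and parameters are made up for illustration; the numbers (300 posts a day, 7 days, 0.1 seconds per comparison) are the ones quoted above, not measurements.

```python
def linear_scan_seconds(posts_per_day: int, days: int, seconds_per_comparison: float) -> float:
    """Estimate the time to compare one upload against every post in the window.

    Hypothetical helper: it assumes one comparison per stored post (an O(n) scan)
    and a fixed cost per comparison.
    """
    candidates = posts_per_day * days  # posts the upload must be compared against
    return candidates * seconds_per_comparison


# Figures taken from the comment above.
estimate = linear_scan_seconds(posts_per_day=300, days=7, seconds_per_comparison=0.1)
print(f"{estimate:.0f} seconds per upload")
```

The cost grows linearly with the size of the window, so doubling the number of days kept would double the wait per upload.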

Zeus · 9 years ago

When you've actually tried it yourself you'll understand why there are so few good solutions. The only one I know of is TinEye. Just try searching for this post's image in TinEye or Google Image Search.


smbadat · 9 years ago

Dayum
Zeus be slayin'


deleted · 9 years ago

Thanks for answering in a bit more detail this time.
I can agree that an O(n) algorithm for a large number of posts is not really optimal.
But you can indeed index it in some way: you can store hashes (and/or any other feature used) in the database, making a lookup already better than O(n), and without implementing the system yourself.
I can see the problem with the content: posts with the same memes or from the same platform can cause trouble.
Now I'm curious about this. I'm going to try coding something and see how it works out.
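
A rough sketch of the hash lookup idea, in Python. This is only an assumption about how it could look: the class name `RepostIndex` is hypothetical, a plain dict stands in for an indexed database column, and SHA-256 is an exact hash, so it only catches byte-identical files. Catching re-encoded or resized images would need a perceptual hash instead.

```python
import hashlib
from typing import Dict, Optional


class RepostIndex:
    """Hypothetical in-memory stand-in for a database table indexed by image hash."""

    def __init__(self) -> None:
        self._by_hash: Dict[str, str] = {}  # hex digest -> id of the first post seen

    def check_and_add(self, post_id: str, image_bytes: bytes) -> Optional[str]:
        """Return the id of the earlier post with identical bytes, or None if new."""
        digest = hashlib.sha256(image_bytes).hexdigest()
        original = self._by_hash.get(digest)  # average O(1), no scan over all posts
        if original is None:
            self._by_hash[digest] = post_id  # first time this exact file is seen
        return original


index = RepostIndex()
first = index.check_and_add("post-1", b"fake image bytes")
second = index.check_and_add("post-2", b"fake image bytes")
print(first, second)
```

The lookup cost no longer depends on how many posts are stored, which is the point being made about indexing.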
-
And anyway, aside from all of this: what is your current system? Because I have seen the exact same post, from the exact same source, posted again and it got through. Is it the "are you sure this isn't a repost" message?
Again, thanks for taking the time to answer.

Zeus · 9 years ago

Let's consider the pros/cons of using a hash plus feature detection. Use a hash with low precision so the feature detection can decide if it's a repost. With this, reposts which look different are already left out. Now feature detection would be able to detect detailed images such as text, e.g. memes.
So now we have slightly better accuracy with a speed penalty, e.g. ~2 seconds. But in the current system we let the user do the feature detection part, with a higher hash precision. It would also be more work to implement feature detection.
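
The two-stage idea described above could be sketched roughly like this in Python. Everything here is an assumption for illustration, not the site's actual system: images are tiny grayscale grids of equal size, the low-precision hash is a simple average hash, and a mean pixel difference stands in for real feature detection. The function names and thresholds are made up.

```python
from typing import List

Image = List[List[int]]  # grayscale pixel grid, values 0-255 (assumed equal sizes)


def average_hash(pixels: Image) -> List[int]:
    """One bit per pixel: 1 if the pixel is brighter than the image mean."""
    flat = [p for row in pixels for p in row]
    mean = sum(flat) / len(flat)
    return [1 if p > mean else 0 for p in flat]


def hamming(a: List[int], b: List[int]) -> int:
    """Number of positions where two bit lists differ."""
    return sum(x != y for x, y in zip(a, b))


def mean_abs_diff(a: Image, b: Image) -> float:
    """Stand-in for feature detection: average per-pixel brightness difference."""
    fa = [p for row in a for p in row]
    fb = [p for row in b for p in row]
    return sum(abs(x - y) for x, y in zip(fa, fb)) / len(fa)


def is_repost(upload: Image, candidate: Image,
              max_hash_distance: int = 2, max_pixel_diff: float = 10.0) -> bool:
    # Stage 1: cheap low-precision hash; posts that look different stop here.
    if hamming(average_hash(upload), average_hash(candidate)) > max_hash_distance:
        return False
    # Stage 2: slower detailed check decides among the look-alikes.
    return mean_abs_diff(upload, candidate) <= max_pixel_diff


original = [[10, 200], [200, 10]]
slightly_brighter = [[15, 205], [205, 15]]   # same picture, re-encoded
inverted_layout = [[200, 10], [10, 200]]     # looks different, fails stage 1
same_template = [[10, 120], [120, 10]]       # same layout, different details

print(is_repost(slightly_brighter, original))
print(is_repost(inverted_layout, original))
print(is_repost(same_template, original))
```

The last case is the meme problem: the coarse hash matches because the layout is the same, so the decision rests entirely on the slower second stage.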

deleted · 9 years ago

Yes, I do see your point.
And talking about TinEye and Google, their function is not only to match the image exactly but also to show the ones that resemble it. The problem we'd have here with memes is actually beneficial for them.
Thank you again for answering. I'll probably try to code something this summer and see how it ends up, just because I'm curious about the implementation of a system like this. If it ends up being a rather good method I'll email you guys about it :)

deleted · 9 years ago

Sassy

deleted · 9 years ago

I've been getting bitched at for 3 days now for calling out actual reposts

lihea · 9 years ago

I've just been downvoting them and reporting them as reposts... Idk if that actually helps any, but I guess if we all follow their system (downvote and report reposts) and stuff still doesn't get better, we at least have a very good case for why there needs to be another system put in place.

lihea · 9 years ago

I'm getting downvoted for saying I downvote reposts? Well, k.

deleted · 9 years ago

Maybe somebody's on a downvote spree


frxck · 9 years ago

To be really honest, why is reposting THAT serious? Just fucking scroll past to the next image.


wolfpack · 9 years ago

I don't mind it, I just wish they gave the original poster the credit


smbadat · 9 years ago

So you'd be happy if you saw the exact same thing today as yesterday?

frxck · 9 years ago

Sure. I would just scroll past. I do it all the time. Just scroll past.


woah · 9 years ago

Posting this tomorrow
