May 2016 – Aaron Sustar

Text processing tweaked further…

May 16, 2016 / Aaron

In my previous post I mentioned that we’re rolling out some minor tweaks and fixes for various edge cases in the automated processing of original texts, which happens before our ENL Semantic Spinning algorithms begin their work.

Now, in some rare cases our processing algorithm could get confused when it encountered the “<” sign, because it assumed the sign might mark the beginning of an HTML tag. HTML tags begin with “<” followed immediately by the tag name (or by “/” in a closing tag), and they end with “>” or, for self-closing tags, “/>”.

Here’s an example of what used to cause a problem for our parser:

Mathematicians sometimes use p < 0.01% to signify low risk.

This would in some cases (depending on the context) turn into:

Mathematicians sometimes use p (the rest of the sentence is missing)

…after the Step 1 processing because our parser mistakenly took the “< 0.01%” as a malformed HTML tag instead of what it truly was.

Well, it certainly doesn’t do that anymore — we just fixed this issue, and rolled the update out to our production servers! 😀
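To illustrate the idea, here is a minimal Python sketch of how a parser can tell a real tag from a lone “<” sign. This is not our production code: the names `TAG_RE` and `strip_tags` are made up for this example, and the rule it assumes (a “<” only starts a tag when a letter, or “/” plus a letter, follows it immediately) is a simplification.

```python
import re

# Hypothetical rule for this sketch: "<" only opens a tag when it is followed
# immediately by a letter (or by "/" and a letter), and the tag must be closed
# by ">" or "/>" before another "<" appears.
TAG_RE = re.compile(r"</?[A-Za-z][A-Za-z0-9-]*(?:\s[^<>]*)?/?>")


def strip_tags(text: str) -> str:
    """Remove things that look like real HTML tags; leave a lone "<" alone."""
    return TAG_RE.sub("", text)


if __name__ == "__main__":
    # The "<" is followed by a space, so it is treated as plain text.
    print(strip_tags("Mathematicians sometimes use p < 0.01% to signify low risk."))
    # Real tags are still removed.
    print(strip_tags("This is <b>bold</b>."))
```

With a rule like this, the sentence from the example above passes through untouched, while actual tags such as `<b>` and `</b>` are still recognized.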

New minor updates to text processing…

May 11, 2016 / Aaron

We’ve just rolled out another tweak to our algorithm, namely the part of the algorithm that takes you from Step 1 to Step 2 of the spinning process inside Spin Rewriter.

There used to be some edge cases where our algorithm would incorrectly place a period (.) at the end of a paragraph if that paragraph ended with either a single quote or a double quote, preceded by another period.

Here’s what I mean:

This right here is a “test.” (end of paragraph)

…would sometimes turn into…

This right here is a “test.”. (end of paragraph)

…after the Step 1 processing, and you would find that additional period once taken to Step 2.

Mildly annoying, for sure. 😀 Well, not anymore — we just fixed that, and rolled out the update to our live servers! 😉
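For the curious, the check behind this kind of fix can be sketched in a few lines of Python. Again, this is only an illustration under my own assumptions, not our actual code: the function name `ensure_final_period` and the exact sets of quote and punctuation characters are hypothetical.

```python
# Characters assumed (for this sketch) to count as closing quotes and as
# sentence-ending punctuation.
CLOSING_QUOTES = "'\"’”"
TERMINAL_PUNCTUATION = ".!?…"


def ensure_final_period(paragraph: str) -> str:
    """Add a final period only if the paragraph does not already end a sentence.

    Trailing closing quotes are skipped before checking, so a paragraph that
    ends with a period inside the quotes does not get a second period.
    """
    stripped = paragraph.rstrip()
    if not stripped:
        return stripped
    # Look past any closing quotes to find the last "real" character.
    core = stripped.rstrip(CLOSING_QUOTES)
    if core and core[-1] in TERMINAL_PUNCTUATION:
        return stripped
    return stripped + "."


if __name__ == "__main__":
    print(ensure_final_period("This right here is a “test.”"))
    print(ensure_final_period("This right here is a test"))
```

The key step is looking past the closing quote before deciding whether the paragraph already ends with punctuation.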

We’ve rolled out an improved DB structure

May 2, 2016 / Aaron

As I’ve already mentioned in my previous Redis-related post, we’re improving the data-related structure of our back-end algorithms in order to deliver even more efficient, faster ENL Semantic Spinning to our awesome Spin Rewriter users.

This time around we’ve rolled out an update to the structure of our regular SQL databases. The update has significantly reduced the size of some of the database tables (in some cases by as much as 64-65%, which is astonishing), and that has resulted in much faster database queries…

…which once again translates into a faster & better user experience, and a more robust platform!
