Thursday, May 22, 2008

Using Statistics to See in the Dark--veinglory

The best way to predict how your ebook will sell with an epublisher is to know how their other ebooks have sold. But there are plenty of perfectly good epublishers for which that data is not available. However, you can still have a go at predicting what their sales might be like.

The best approach is to try to find a piece of publicly available information that might correlate with sales. I decided to try the size of the publisher's yahoogroup. If they had more than one, I chose the one aimed at readers and focused on erotic romance. I then looked at the correlation between the number of members and average first-month sales.

So, for the epublishers we do have sales data for, there is a close relationship between these two variables. In fact, the sales figure predicted by the fitted line is 98% accurate compared with the actual sales data. How well this relationship will describe new data really cannot be predicted: many factors influence whether and how a yahoogroup is used and how well a publisher runs it, and we are starting with just 8 data points.

But we will see how well this predictor works once I get data for new presses. If you want to make a prediction of your own, you need to find the epublisher's yahoogroup membership number, multiply it by 0.16 and then add 28.73. As the data set expands this might not prove the best predictor; some other variable or combination of variables might work better. But it's a start.
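For anyone who would rather not do the arithmetic by hand, here is a minimal Python sketch of that calculation. The slope and intercept are the ones given above; the function name and the 500-member group in the example are made up for illustration and do not describe any real publisher.

```python
# Coefficients from the fitted line in this post (based on just 8 publishers).
SLOPE = 0.16
INTERCEPT = 28.73


def predict_first_month_sales(group_members: int) -> float:
    """Estimate average first-month sales from yahoogroup membership."""
    if group_members < 0:
        raise ValueError("membership cannot be negative")
    return SLOPE * group_members + INTERCEPT


if __name__ == "__main__":
    # Hypothetical group size, chosen only to show the arithmetic.
    members = 500
    estimate = predict_first_month_sales(members)
    print(f"{members} members -> about {estimate:.0f} first-month sales")
```

Treat the result as a rough guess: with so few data points behind the line, it will be least trustworthy for groups much larger or smaller than the ones in the sample.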

If you have some variable other than yahoogroup size that you would like to look at, but you are not sure how to make this calculation, I can send an MS Excel sheet set up to run the regression for you.
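If you prefer code to a spreadsheet, the same kind of straight-line fit can be sketched in a few lines of Python. This is ordinary least squares written out by hand; it is not a copy of the Excel sheet. The numbers in the example are invented placeholders, not the real publisher data, and R-squared is reported only as one common measure of fit, since this post does not say exactly which statistic the 98% figure is.

```python
from statistics import mean


def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept.

    Returns (slope, intercept, r_squared).
    """
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("need two equal-length lists with at least 2 points")
    x_bar, y_bar = mean(xs), mean(ys)
    sxx = sum((x - x_bar) ** 2 for x in xs)
    syy = sum((y - y_bar) ** 2 for y in ys)
    sxy = sum((x - x_bar) * (y - y_bar) for x, y in zip(xs, ys))
    if sxx == 0:
        raise ValueError("all x values are identical; no line can be fitted")
    slope = sxy / sxx
    intercept = y_bar - slope * x_bar
    # R-squared: share of the variation in y explained by the line.
    r_squared = (sxy * sxy) / (sxx * syy) if syy else 1.0
    return slope, intercept, r_squared


if __name__ == "__main__":
    # Placeholder values only: replace with your own variable and sales data.
    predictor = [100, 200, 300, 400]
    sales = [45, 61, 77, 93]
    slope, intercept, r2 = fit_line(predictor, sales)
    print(f"sales = {slope:.2f} * x + {intercept:.2f}  (R-squared = {r2:.2f})")
```

Swap in whatever variable you want to test for `predictor`, with one value per publisher, in the same order as the `sales` list.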


veinglory said...

Oh, come on peeps. Linear regression! 98%! That's got to be worth a comment! [sigh] ;)

Anonymous said...

Oh yeah, I get this! Brings back memories of AP Statistics class (and bio, chem, phys labs) when I tested tons of stuff for correlation. I got a thrill every time I found a correlation. It’s so cool you found a linear correlation between # of yahoogroup members and avg first-month sales! I wouldn’t have guessed those 2 variables have such a strong linear correlation. (98%!) Hope when you get more data the correlation will still be strong.

A Summary:
Emily’s experiments produced a pretty good formula:
average first-month sales = 0.16*yahoogroup size + 28.73

veinglory said...

LOL. Thanks for the encouragement.