Simple Email Manipulation with Mailgun

Like most of my side projects, this one started with a problem. I was in the market for a new cellphone (R.I.P. Galaxy SIII) and came across Swappa.com. They allow users to buy and sell used smartphones, and the prices seemed reasonable for what I was looking for.

The Problem

I signed up for an account and “subscribed” to a few listings of phones I was interested in. Pretty soon I started getting emails. Lots of emails:

The number in parentheses is how many emails are in each chain. So. Many. Emails.

The problem was that I got a message for every LG G3 that was added, when my requirements were more specific. I wasn’t interested in a badly beaten phone selling for nearly list price. I wanted a deal.

The other problem with these emails is that they didn’t contain any useful information. Look:

Nothing useful outside of the actual link to the listing.

I don’t want to have to click the link on every single one of these emails just to see another listing I’m not interested in. So what could I do?

Introducing Mailgun

I stumbled across a service called Mailgun that allows you to accept incoming emails as POST variables, which is really cool. The way it works is pretty simple: when you sign up for a Mailgun account you set up an email address that routes emails through their servers. It sounds difficult, but they walk you through it, and with cPanel it amounts to copying and pasting things in the right spot. 

Once that’s set up, any email sent to this new address runs through the Mailgun routing system. You can set up forwarding rules depending on the sender, which isn’t all that different from other email providers. But you can also configure routes to point to a web page, like this:

Mailgun routing.

So why would we want to do this? Well, when Mailgun routes an email to a PHP page, it delivers the pieces of the email (sender, subject, body, and so on) as POST variables, which lets us pull out just the parts we actually want. So this is going to be our basic workflow:

Workflow.

Let’s do some manipulation. First, we grab the pieces of the email. You can see that it's as easy as reading a POST variable. I'm saving each one into its own variable here, but that's not really necessary. I only do it to clean up my code later on.

  $sender = $_POST['sender'];
  $recipient = $_POST['recipient'];
  $subject = $_POST['subject'];
  $body = $_POST['body-plain'];
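A field can be missing from a given request, and reading a missing key straight out of $_POST raises a notice. Here's a minimal sketch of reading the fields defensively. The helper name read_mail_fields and the empty-string default are my own assumptions for illustration, not part of Mailgun or of the script in this post.

```php
<?php
// Hypothetical helper (not part of Mailgun or the original script):
// copy the fields we care about out of a POST array, defaulting to ''.
function read_mail_fields(array $post) {
  $fields = array();
  foreach (array('sender', 'recipient', 'subject', 'body-plain') as $key) {
    $fields[$key] = isset($post[$key]) ? (string) $post[$key] : '';
  }
  return $fields;
}

// On the real page you would pass $_POST. A hand-built array is used here
// so the sketch can run on its own; the values are made up.
$mail = read_mail_fields(array(
  'sender' => 'alerts@example.com',
  'subject' => 'New listing',
));

echo $mail['subject'], "\n";
var_dump($mail['body-plain'] === '');
```

Passing the array in as a parameter, rather than reading $_POST inside the function, also makes the helper easy to try out from the command line.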

If you look at the original email that Swappa was sending, you'll see that the URL format is pretty straightforward. We can pull the /listing/[device]/view link out of the email body with the preg_match() function built into PHP:

  <?php preg_match('/(http|https):\/\/swappa\.com\/listing\/(\w+)\/view/', $body, $matches); ?>
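To see what preg_match() hands back, here's a small sketch run against a made-up email body. The listing code "ABC123" and the surrounding text are invented for illustration; the real Swappa emails will look different.

```php
<?php
// Made-up email text; the listing code "ABC123" is invented for illustration.
$body = "A new listing was posted.\nhttps://swappa.com/listing/ABC123/view\nThanks!";

$pattern = '/https?:\/\/swappa\.com\/listing\/(\w+)\/view/';

if (preg_match($pattern, $body, $matches)) {
  $link = $matches[0];  // The whole matched URL.
  $code = $matches[1];  // Just the part between /listing/ and /view.
}
else {
  // No listing link in this email.
  $link = null;
  $code = null;
}

echo $link, "\n";
echo $code, "\n";
```

$matches[0] always holds the full match, and each set of parentheses in the pattern adds one more entry after it. Checking the return value first avoids reading $matches when nothing matched.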

Now that we have a link, we can use some scraping magic to grab the information from the website*.

There are a bunch of web scraping tools out there, but for this project I chose to use the PHP Simple HTML DOM Parser library. It works, and it's pretty easy to figure out. 
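If you'd like to get a feel for this kind of lookup without installing anything, PHP's built-in DOMDocument and DOMXPath classes can do something similar. To be clear, this is a swapped-in technique and not the library used in this post, and the markup below is made up to imitate the kind of itemprop spans a listing page might contain.

```php
<?php
// Made-up markup for illustration; the real page structure may differ.
$html = '<section id="section_alerts">'
  . '<span itemprop="description"> LG G3 (Verizon) </span>'
  . '<span itemprop="price">250</span>'
  . '</section>';

// Collect parser warnings (for example about HTML5 tags such as <section>)
// instead of printing them.
libxml_use_internal_errors(true);
$doc = new DOMDocument();
$doc->loadHTML($html);
libxml_clear_errors();

// string(...) returns the text of the first matching node, or '' if none.
$xpath = new DOMXPath($doc);
$description = trim($xpath->evaluate('string(//span[@itemprop="description"])'));
$price = trim($xpath->evaluate('string(//span[@itemprop="price"])'));

echo $description, ' costs $', $price, "\n";
```

The XPath expression //span[@itemprop="price"] plays the same role as the CSS-style selector span[itemprop=price] that the Simple HTML DOM Parser accepts.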

Now we can combine everything with a few if statements to only send a (customized) email to ourselves with the information we need when certain criteria are met. Here's what my code looks like when it's all done.

  <?php
  require('path/to/simplehtmldom/simple_html_dom.php');

  // Check if this page is being hit via an email or a web browser.
  if (isset($_REQUEST['sender']))
  {
    // Get email pieces.
    $sender = $_REQUEST['sender'];
    $recipient = $_REQUEST['recipient'];
    $subject = $_REQUEST['subject'];
    $body = array(
      'plain' => $_REQUEST['body-plain'],
      'html' => isset($_REQUEST['body-html']) ? $_REQUEST['body-html'] : '',
    );
    // Get link from email.
    $link = find_link($body['plain']);

    // Scrape link for relevant info (nothing to scrape if no link was found).
    $info = ($link !== NULL) ? scrape_swappa($link) : array();

    // Get the file with the list of devices & prices.
    $devices = array();
    $data_file = fopen("devices.txt", "r") or die("Unable to open file!");
    // For each line of the file, save the info in an array.
    while (($line = fgets($data_file)) !== FALSE) {
      $devices_array = explode('|', trim($line));
      // Skip blank or malformed lines.
      if (count($devices_array) < 2) {
        continue;
      }
      $devices[] = array(
        'device' => $devices_array[0],
        'price' => $devices_array[1],
      );
    }
    fclose($data_file);

    // We don't care about devices in good or fair condition.
    $bad_condition = array('Good', 'Fair');
    if (!empty($info))
    {
      $info = $info[0];
      foreach ($devices as $device)
      {
        // Check if the device from swappa is in our list of devices.
        if (strpos($device['device'], $info['description']) !== FALSE)
        {
          // Check if the device price and condition match our requirements:
          // at or below our price, and not in a condition we want to skip.
          if ((float) $info['price'] <= (float) $device['price']
            && !in_array($info['condition'], $bad_condition))
          {
            // Populate email parts with new info.
            $subject = '[Swappa] ($' . $info['price'] . ') '
              . $info['condition'] . ' | ' . $info['description'];
            $body = 'A new device matching your parameters (' . $device['device']
              . ' for less than $' . $device['price'] . ') was posted on Swappa, '
              . 'check it out: ' . $link;

            // Send email, then stop so we only send one message per listing.
            $result = send_message($subject, $body);
            break;
          }
        }
      }
    }

    // Whether or not it met our criteria, answer with a 200 so Mailgun stops
    // attempting to resend.
    http_response_code(200);
  }
  // This is being hit via a web browser, probably for testing.
  else
  {
    // Testing the scraping to make sure it works.
    if (isset($_GET['url'])) {
      $url = $_GET['url'];
      $info = scrape_swappa($url);

      var_dump(empty($info) ? NULL : $info[0]);
    }
    // Testing the file reading to make sure it works.
    else
    {
      $info = array();
      $myfile = fopen("devices.txt", "r") or die("Unable to open file!");
      // Read one line at a time until end-of-file.
      while (($line = fgets($myfile)) !== FALSE) {
        $info_array = explode('|', trim($line));
        if (count($info_array) < 2) {
          continue;
        }
        $info[] = array(
          'device' => $info_array[0],
          'price' => $info_array[1],
        );
      }
      fclose($myfile);
      foreach ($info as $i)
      {
        print '<p>Device: ' . $i['device'] . '<br>';
        print 'Price: ' . $i['price'] . '</p>';
      }
    }
  }

  /**
   * Functions
   */

  // Returns the first Swappa listing link in the text, or NULL if none.
  function find_link($email_text) {
    if (preg_match('/(http|https):\/\/swappa\.com\/listing\/(\w+)\/view/',
      $email_text, $matches)) {
      return $matches[0];
    }
    return NULL;
  }

  // Returns an array of items (price, condition, description) from a listing.
  function scrape_swappa($url) {
    $ret = array();

    // Create HTML DOM.
    $html = file_get_html($url);
    if (!$html) {
      return $ret;
    }

    // Find the page title section with the info we need.
    foreach ($html->find('section#section_alerts') as $section) {
      $item = array();
      // Get description.
      $description = trim($section->find('span[itemprop=description]', 0)
        ->plaintext);
      // Get price.
      $item['price'] = trim($section->find('span[itemprop=price]', 0)
        ->plaintext);
      // Get condition from description.
      $condition_array = explode('in ', $description);
      if (count($condition_array) < 2) {
        continue;
      }
      $item['condition'] = ltrim(substr($condition_array[1], 0, -10));
      $item['description'] = rtrim($condition_array[0], ', ');

      $ret[] = $item;
    }

    // Clean up memory.
    $html->clear();
    unset($html);
    return $ret;
  }

  // Sends the email through the Mailgun API and returns the decoded response.
  function send_message($subject, $body) {
    // Replace the placeholder values below with your own.
    $mg_api = '[your-mailgun-api-key]';
    $mg_version = 'api.mailgun.net/v3/';
    $mg_domain = "your.mailgun.domain.com";
    $mg_from_email = "[email protected]";
    $mg_reply_to_email = "[email protected]";

    $mg_message_url = "https://" . $mg_version . $mg_domain . "/messages";

    $ch = curl_init();
    curl_setopt($ch, CURLOPT_URL, $mg_message_url);
    curl_setopt($ch, CURLOPT_HTTPAUTH, CURLAUTH_BASIC);
    curl_setopt($ch, CURLOPT_USERPWD, 'api:' . $mg_api);
    curl_setopt($ch, CURLOPT_FOLLOWLOCATION, false);
    curl_setopt($ch, CURLOPT_RETURNTRANSFER, true);
    curl_setopt($ch, CURLOPT_HEADER, false);
    curl_setopt($ch, CURLOPT_CONNECTTIMEOUT, 10);
    // Keep certificate checks on; the API key travels over this connection.
    curl_setopt($ch, CURLOPT_SSL_VERIFYPEER, true);
    curl_setopt($ch, CURLOPT_SSL_VERIFYHOST, 2);
    curl_setopt($ch, CURLOPT_POST, true);
    curl_setopt($ch, CURLOPT_POSTFIELDS,
      array(
        'from' => $mg_from_email,
        'to' => '[email protected]',
        'h:Reply-To' => '<' . $mg_reply_to_email . '>',
        'subject' => $subject,
        'html' => $body,
      ));
    $result = curl_exec($ch);
    curl_close($ch);
    $res = json_decode($result, TRUE);
    return $res;
  }
  ?>
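The script reads devices.txt and splits each line on the | character into a device name and a price. The post doesn't show the file itself, so here's a sketch with invented contents. The helper name parse_devices, the sample devices, and the reading of the second column as a maximum price are all assumptions on my part.

```php
<?php
// Invented sample contents for devices.txt: one "device|max price" per line.
$sample = "LG G3|250\nNexus 5|200\n";

// Hypothetical helper: turn that text into an array of device/price pairs.
function parse_devices($text) {
  $devices = array();
  foreach (explode("\n", $text) as $line) {
    $parts = explode('|', trim($line));
    // Skip blank or malformed lines.
    if (count($parts) < 2 || $parts[0] === '') {
      continue;
    }
    $devices[] = array(
      'device' => trim($parts[0]),
      'price' => (float) $parts[1],
    );
  }
  return $devices;
}

$devices = parse_devices($sample);
print_r($devices);
```

Trimming each line matters because fgets() and explode() both leave the line ending attached to the last field, which would otherwise end up inside the price.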

* This is a moral grey area, so tread lightly. Many websites frown upon having their content scraped, while others believe that any publicly available content should be fair game.