Storing HTTP_X_FORWARDED_FOR in Django
I occasionally get a
ProgrammingError: value too long for type character(15)
when people post to my blog. There aren't any fields in my model declared with a max_length of 15, so I was always a little confused. In almost all cases it was a spammer anyway, so I never took the time to investigate further.
But then someone just emailed me and told me they were getting a 500 when posting a comment to my blog. So I decided to investigate and that's where it started getting interesting...
Doing
./manage.py sql leonardo
revealed no sign of a field of length 15 either. So I went into the shell of my database (PostgreSQL, in my case).
A quick
\d leonardo_comment
revealed
author_ipaddress | character(15)
Like many blogs, I capture the IP address of the commenter so I can block spam.
In my model I have:
author_ipaddress = models.IPAddressField(null=True)
Which Django's ORM translates to:
"author_ipaddress" inet NULL
yet the column in my PostgreSQL database is evidently a character(15).
Why would an IP Address be more than 15 characters, though?
Well, I went back to the error log and noticed this:
'HTTP_X_FORWARDED_FOR': '192.168.0.127, 12.34.56.78',
(note: I changed the second address to protect the original poster)
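A quick check in Python makes the mismatch concrete. This is only a sketch: the header value is the one from the log above, and the variable names are mine, chosen for illustration.

```python
# Longest possible dotted-quad IPv4 address: four 3-digit octets plus three dots.
MAX_IPV4_LENGTH = len("255.255.255.255")

# Header value copied from the error log above.
forwarded_for = "192.168.0.127, 12.34.56.78"

print(MAX_IPV4_LENGTH)                       # 15
print(len(forwarded_for))                    # 26
print(len(forwarded_for) > MAX_IPV4_LENGTH)  # True
```

A single IPv4 address always fits in 15 characters, but two of them joined by a comma and a space do not.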
You see, because the Apache instance running Django is behind another web server (on the same machine), I can't rely on REMOTE_ADDR: it's always 127.0.0.1. So I log HTTP_X_FORWARDED_FOR instead.
What I didn't realise until now is that HTTP_X_FORWARDED_FOR can be a list.
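A minimal sketch of treating the header as a list, assuming a plain dict shaped like Django's request.META. The helper name forwarded_addresses is made up for this example and is not part of Django.

```python
def forwarded_addresses(meta):
    """Return the addresses in HTTP_X_FORWARDED_FOR as a list of strings.

    `meta` is assumed to be a dict shaped like Django's request.META.
    Falls back to REMOTE_ADDR when the header is missing or empty.
    """
    header = meta.get("HTTP_X_FORWARDED_FOR", "")
    # Entries are comma-separated; strip the optional whitespace around each.
    addresses = [part.strip() for part in header.split(",") if part.strip()]
    if addresses:
        return addresses
    remote = meta.get("REMOTE_ADDR")
    return [remote] if remote else []


meta = {
    "REMOTE_ADDR": "127.0.0.1",
    "HTTP_X_FORWARDED_FOR": "192.168.0.127, 12.34.56.78",
}
print(forwarded_addresses(meta))  # ['192.168.0.127', '12.34.56.78']
```

Splitting on the bare comma and stripping whitespace afterwards avoids depending on whether a proxy writes ", " or just "," between entries.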
I guess the best solution is to just change the field to a CharField.
Other Djangonauts who are logging HTTP_X_FORWARDED_FOR might want to heed this warning: don't use IPAddressField.
Comments (7)
Henrik Lied on Nov. 11, 2008:
I had the same problem last week. I was trying to create some statistics for a video site, and GeoIP kept complaining that my IP address was formatted incorrectly. I simply did a .split(', ')[1] on the IP addresses, and fixed it quickly that way.
Van Gale on Nov. 11, 2008:
I was just browsing the Django resources list on the Django wiki and saw this:
http://blog.holsman.net/2006/8/handy-django-middleware-for-shared-hosting
James Tauber on Nov. 11, 2008:
Interestingly, Henrik takes [1] from split and the code in Van's link takes [0].
I'm not sure if the ordering of HTTP_X_FORWARDED_FOR is consistent, but certainly in my experience today, [1] is more useful than [0]. I wonder, though, if it should actually be [-1].
Either way, I think ultimately keeping the entire list is probably the best thing to do.
Chris Lambacher on Nov. 11, 2008:
If you are behind a reverse-proxying Apache, you want the "last" address (i.e. address.split(',')[-1]). Apache will append the requester's IP to the header if a value already exists.
It is possible that the requester has gone through more than one proxy. HTTP_X_FORWARDED_FOR is also often spoofed by spammers/script kiddies in an attempt to hack servers. They could put a comma in at any point. In either case address[0] and address[1] would be incorrect.
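One way to sketch the "take the last address" idea, assuming you know the addresses of your own proxies: walk the list from the right and stop at the first entry that is not one of them. The function name client_address and the TRUSTED_PROXIES set are hypothetical, chosen only for this illustration.

```python
import ipaddress

# Hypothetical: addresses of the proxies you run yourself.
TRUSTED_PROXIES = {"127.0.0.1"}


def client_address(header, trusted=TRUSTED_PROXIES):
    """Return the right-most address in `header` that is not a trusted proxy.

    Returns None if no such entry exists or if that entry is not a valid IP.
    """
    parts = [p.strip() for p in header.split(",") if p.strip()]
    for candidate in reversed(parts):
        if candidate in trusted:
            continue  # appended by one of our own proxies; keep walking left
        try:
            ipaddress.ip_address(candidate)
        except ValueError:
            return None  # not an IP address at all; treat as spoofed
        return candidate
    return None


print(client_address("192.168.0.127, 12.34.56.78"))  # 12.34.56.78
```

Anything to the left of the returned entry was supplied by the client or by proxies you do not control, so it is worth keeping for logging but should not be trusted.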
Brian Rosner on Nov. 11, 2008:
Take a look at django.middleware.http.SetRemoteAddrFromForwardedFor. It sets REMOTE_ADDR for you from HTTP_X_FORWARDED_FOR and handles a list for you. Also see http://code.djangoproject.com/ticket/3872
Doug Napoleone on Nov. 11, 2008:
I have yet to see a full solution for this.
I wrote something in WSGI to deal with the bots which were spoofing this field. In the end, the information is sent to a separate Perl script which manages it (updating the firewall). We track everything, and if the first IP is not as expected, we treat it as a bot playing games.
For a full solution I would want something which breaks out all the IPs and stores them, marking where they came from, and which notices whether an expected IP is present (depending on config it could be the first, last, or middle IP; yes, there can be more than two), with triggers that people can plug handlers into.
Of course I want the SUV w/ off road capabilities even though I live in a city :-)
Last Modified: Nov. 11, 2008
Author: James Tauber
Marc Fargas on Nov. 11, 2008:
Oh I see my LAN IP address there! ;)
I'd guess that the header lists all the clients/proxies the request went through. I have a squid proxy here that sets HTTP_X_FORWARDED_FOR to my IP (the LAN one); then, when the request reaches your proxy, it appends another IP to the list (i.e. my external one).
It's not a very common thing, as not everybody runs behind proxies these days anyway ;)))