My understanding (which may be better or worse than the suggestions above, as I'm not an absolute expert on this) is that you were close to the money with your original suggestion of using robots.txt - with a couple of buyer-bewares.
Basically, the three files you could edit are robots.txt, .htaccess and .htpasswd (there are MANY other ways to restrict content, of course). I won't discuss .htpasswd, as I have never really used it and I presume you don't want everyone to need a password to access the site.
By far the easiest to work with in this case is robots.txt. This file tells robots whether or not they are allowed to crawl your site. You can set it to disallow nothing, disallow everything, or disallow specific directories. (The original standard only defines Disallow - anything not disallowed is presumed allowed - though major crawlers such as Googlebot also honour a non-standard Allow directive.)
The following allows everything:
User-agent: *
Disallow:
The following disallows everything (note the slash):
User-agent: *
Disallow: /
And various combinations are possible, blocking specific bots (i.e. user-agents), specific directories, or both. The only catches are that you need to know the name of the bot doing the crawling you don't want (your access logs may tell you), and that robots.txt is advice rather than a restriction - a "bad" bot that hasn't been programmed to play nice will simply ignore it and do what it wants.
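For example, the following would block one bot from one directory while leaving everything else open to everyone (the user-agent name "ExampleBot" is made up - substitute the name you find in your access logs):

User-agent: ExampleBot
Disallow: /private/

User-agent: *
Disallow: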
With .htaccess you can restrict access to the whole site or to individual directories, but it's more complicated. AND it's worth noting that the restriction will apply to anyone and everyone, unless you know the IP address of the "bad bot" and that address doesn't keep changing. PLUS, you really should do the tricky stuff needed to make sure the bot can still see robots.txt (blocking everything blocks robots.txt as well), as otherwise it will presume robots.txt doesn't exist, try to crawl your other pages, fail, and keep trying - no real harm done, except a LOT of extra server traffic.
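As a rough sketch only (Apache 2.2-style syntax; the bot name and the robots.txt exception are my assumptions, so check this against your server's documentation before using it), an .htaccess that blocks one user-agent everywhere except robots.txt might look like:

# Tag requests whose User-Agent contains "ExampleBot" (name is made up)
SetEnvIfNoCase User-Agent "ExampleBot" bad_bot

# Let everyone in except tagged requests
Order allow,deny
Allow from all
Deny from env=bad_bot

# But still let the bot fetch robots.txt, so it can read the rules
<Files "robots.txt">
  Order allow,deny
  Allow from all
</Files>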
If you then want the pages removed from Google as well, just use Webmaster Tools as suggested above - I'm not sure about other engines, though over time your pages should disappear from their results if you're using robots.txt.
For more info try this link:
http://www.webmasterworld.com/robots_txt/3523560.htm
Regarding your specific problem with 123people, I'm guessing that if they already have your info they aren't giving it up, and removing pages from the web after the horse has bolted won't help.
SORRY ALL for being so wordy - hopefully this helps a little rather than just causing more confusion!
Matt
Marketing Web