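I. Scraping subway data from AMap (高德地图)
1. Approach
First, open the AMap website in Google Chrome and click "Subway" (地铁) in the top menu bar to reach the subway map site shown below: http://map.amap.com/subway/index.html.
Press F12, or right-click and choose Inspect, to open the developer tools, then select the Network tab. Click Xi'an (西安) on the page: two new response entries appear at arrow 2, and left-clicking one of them shows the real request URL and related details at arrow 3.
Copy the request URL (http://map.amap.com/service/subway?_1612234237437&srhdata=6101_drw_xian.json) and open it in a new browser tab. The response is JSON containing the station information for every line, which is exactly the data we want.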
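The request URL above appears to follow a fixed pattern: a leading `_<number>` parameter, then `srhdata=<city code>_drw_<city name>.json`. Below is a minimal sketch of building such a URL. Treating the number as a millisecond timestamp, and the helper name `build_subway_url`, are assumptions rather than anything documented by AMap; only the Xi'an values (`6101`, `xian`) come from the URL above.

```python
import time

BASE_URL = "http://map.amap.com/service/subway"


def build_subway_url(city_code, city_name, timestamp_ms=None):
    """Build a subway-data URL following the pattern seen in the browser.

    Assumption: the leading `_<number>` parameter is a millisecond timestamp.
    """
    if timestamp_ms is None:
        timestamp_ms = int(time.time() * 1000)
    return "{}?_{}&srhdata={}_drw_{}.json".format(
        BASE_URL, timestamp_ms, city_code, city_name
    )


# Reproduces the Xi'an URL captured above when given the same number.
print(build_subway_url("6101", "xian", 1612234237437))
```

Whether other cities follow the same code and name convention would need to be checked in the Network tab for each city.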
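2. Core Python code
Fetching the page content
import random

import requests

# USER_AGENTS is not defined in the original post; this single entry is a
# placeholder assumption. Replace it with real browser User-Agent strings.
USER_AGENTS = ["Mozilla/5.0 (Windows NT 10.0; Win64; x64)"]


def getHtml(url):
    """Return the raw response body as bytes, or None if the request fails."""
    headers = {
        "Host": "map.amap.com",
        "User-Agent": random.choice(USER_AGENTS),
    }
    try:
        response = requests.get(url, headers=headers, timeout=10)
        response.raise_for_status()
        return response.content
    except requests.RequestException as e:
        print("Request failed:", e)
        return None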
Parsing the JSON data
import json


def parse_page(text):
    """Parse the subway JSON and return a list with one dict per line."""
    lines_list = json.loads(text).get('l')
    # Table of subway line information
    lineInfo_list = []
    for line in lines_list:
        # Information collected for one line
        lineInfo = {}
        lineInfo['ln'] = line.get('ln')
        print(lineInfo['ln'])
        # Stations on this line
        station_list = []
        st_list = line.get('st')
        for st in st_list:
            station_dict = {}
            station_dict['name'] = st.get('n')
            # 'sl' is a "longitude,latitude" string
            coord = st.get('sl')
            station_dict['lon'] = coord.split(',')[0]
            station_dict['lat'] = coord.split(',')[-1]
            print("Station:", station_dict['name'])
            print("Longitude:", station_dict['lon'])
            print("Latitude:", station_dict['lat'])
            station_list.append(station_dict)
        print('-----------------------------------')
        lineInfo['st'] = station_list
        lineInfo['kn'] = line.get('kn')
        lineInfo['ls'] = line.get('ls')
        lineInfo['cl'] = line.get('cl')
        lineInfo_list.append(lineInfo)
    # Return the list of per-line information
    return lineInfo_list
Saving the station data (station name, longitude, latitude)
def save_file(filename, lineInfo):
    # Append one "name lon lat" row per station, separated by single spaces
    with open(filename, 'a', encoding='utf-8') as f:
        for st in lineInfo['st']:
            f.write(st['name'] + " " + st['lon'] + " " + st['lat'] + "\n")
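To make the data flow concrete, here is a small self-contained sketch of the same parse-then-save idea run on a hand-written payload. The sample JSON is made up for illustration and only mimics the keys used above (`l`, `ln`, `st`, `n`, `sl`); the station names and coordinates are placeholders, and `extract_rows` is a hypothetical helper, not part of the original script. It assumes `sl` holds a "longitude,latitude" string, as the printed labels above suggest.

```python
import json

# Hand-written sample that mimics the keys used above; not real AMap output.
SAMPLE = json.dumps({
    "l": [
        {
            "ln": "Line A",
            "st": [
                {"n": "Station1", "sl": "108.90,34.20"},
                {"n": "Station2", "sl": "108.95,34.25"},
            ],
        }
    ]
})


def extract_rows(text):
    """Return one "name lon lat" string per station, in payload order."""
    rows = []
    for line in json.loads(text).get("l", []):
        for st in line.get("st", []):
            lon, lat = st["sl"].split(",")
            rows.append("{} {} {}".format(st["n"], lon, lat))
    return rows


for row in extract_rows(SAMPLE):
    print(row)
```

Each printed row has the same "name lon lat" layout that the saved text file uses, which is what the shapefile step later reads back.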
II. Generating the shapefiles and exporting the image
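import io
import os

import arcpy


def create_shp(text, dirpath):
    """Build point and line shapefiles in dirpath from a "name lon lat" text file."""
    base = os.path.splitext(os.path.basename(text))[0]
    point_shpname = base + "_point.shp"
    line_shpname = base + "_line.shp"
    # io.open accepts an encoding under both Python 2 and Python 3
    with io.open(text, 'r', encoding='utf-8') as f:
        lines = f.readlines()
    # Note: AMap serves GCJ-02 coordinates, so tagging them as WGS84
    # (EPSG:4326) leaves an offset against true WGS84 data.
    spatRef = arcpy.SpatialReference(4326)
    createFC = arcpy.CreateFeatureclass_management(dirpath, point_shpname, "POINT", "", "", "", spatRef)
    arcpy.AddField_management(createFC, "name", "TEXT")
    arcpy.AddField_management(createFC, "lat", "DOUBLE")
    arcpy.AddField_management(createFC, "lon", "DOUBLE")
    cur = arcpy.InsertCursor(createFC)
    for line in lines:
        if not line.strip():
            continue  # skip blank rows
        info = line.strip().split(" ")
        row = cur.newRow()
        name = info[0]
        point = arcpy.Point()
        point.X = float(info[1])  # longitude
        point.Y = float(info[2])  # latitude
        pointGeometry = arcpy.PointGeometry(point)
        row.shape = pointGeometry
        row.name = name
        row.lon = point.X
        row.lat = point.Y
        cur.insertRow(row)
    # Release the cursor so the inserted rows are flushed and the lock is dropped
    del cur
    # Connect the stations into a line, in file order
    arcpy.PointsToLine_management(os.path.join(dirpath, point_shpname),
                                  os.path.join(dirpath, line_shpname))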
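The loop above splits each row on single spaces, which mis-parses a station name that itself contains a space. Below is a minimal sketch of a more defensive row parser that could be used before building the point geometry. `parse_station_row` is a hypothetical helper, not part of the original script, and the range check assumes the values are plain longitude and latitude in degrees.

```python
def parse_station_row(line):
    """Parse a "name lon lat" row into (name, lon, lat), or None if blank.

    Splitting from the right keeps spaces inside the station name intact.
    """
    line = line.strip()
    if not line:
        return None
    parts = line.rsplit(" ", 2)
    if len(parts) != 3:
        raise ValueError("expected 'name lon lat', got: {!r}".format(line))
    name, lon, lat = parts[0], float(parts[1]), float(parts[2])
    if not (-180.0 <= lon <= 180.0 and -90.0 <= lat <= 90.0):
        raise ValueError("coordinates out of range: {!r}".format(line))
    return name, lon, lat


print(parse_station_row("North Gate 108.95 34.27"))
```

The helper uses no arcpy calls, so it can be checked outside ArcGIS before the rows are written to the shapefile.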
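Load the generated point and line shapefiles into ArcMap, set the symbol styles and sizes, and then export the map as an image. Remember to set the image resolution to 300 dpi when exporting.
In the end, as the figure below shows, you have a subway map of your own. The image will probably be recompressed when uploaded to WeChat and look blurrier than the original, but the actual export is quite sharp.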