TutorBot crawling: please write code that crawls the site https://aitutor21.com/ailink/862
Author information
- Author bryanai
- Date posted
Content information
- Views 1,077
Body
Here is Python code that crawls the page `https://aitutor21.com/ailink/862` using requests and BeautifulSoup:
```python
import requests
from bs4 import BeautifulSoup

# URL to crawl
url = "https://aitutor21.com/ailink/862"

# Send a GET request to the website (the timeout avoids waiting indefinitely)
response = requests.get(url, timeout=10)

# Check if the request was successful
if response.status_code == 200:
    # Parse the HTML content
    soup = BeautifulSoup(response.content, 'html.parser')

    # Extract the title of the page (soup.title is None if there is no <title> tag)
    title = soup.title.string if soup.title else None
    print("Title:", title)

    # Extract all paragraphs
    paragraphs = soup.find_all('p')
    for i, paragraph in enumerate(paragraphs):
        print(f"Paragraph {i+1}: {paragraph.get_text()}")

    # Extract other specific elements if needed
    # For example, extracting the first header
    header = soup.find('h1')
    if header:
        print("Header:", header.get_text())
else:
    print("Failed to retrieve the webpage. Status code:", response.status_code)
```
This code will:
1. Send a GET request to the provided URL.
2. Parse the HTML content using BeautifulSoup.
3. Extract and print the title of the page.
4. Extract and print all paragraph texts.
5. Extract and print the first header if available.
Make sure to install BeautifulSoup and requests using pip if you haven't already:
```sh
pip install beautifulsoup4 requests
```
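If it helps, the parsing steps listed above can also be wrapped in a small function that takes an HTML string, so the extraction logic can be tried without sending a request. This is only a sketch: the function name `extract_page_fields` and the sample markup are made up for illustration and are not taken from the target page, whose real structure may differ.

```python
from bs4 import BeautifulSoup


def extract_page_fields(html):
    """Return the title, paragraph texts, and first <h1> text found in an HTML string.

    Hypothetical helper for illustration; it mirrors the title / paragraph /
    header steps of the script above.
    """
    soup = BeautifulSoup(html, "html.parser")

    # soup.title is None when the document has no <title> tag
    title = soup.title.get_text(strip=True) if soup.title else None

    # Collect the text of every <p> element, in document order
    paragraphs = [p.get_text(strip=True) for p in soup.find_all("p")]

    # find() returns None when there is no <h1> element
    header_tag = soup.find("h1")
    header = header_tag.get_text(strip=True) if header_tag else None

    return {"title": title, "paragraphs": paragraphs, "header": header}


# Made-up sample markup, used here instead of a live request
sample_html = """
<html>
  <head><title>Sample page</title></head>
  <body>
    <h1>Main heading</h1>
    <p>First paragraph.</p>
    <p>Second paragraph.</p>
  </body>
</html>
"""

fields = extract_page_fields(sample_html)
print("Title:", fields["title"])
print("Header:", fields["header"])
for i, text in enumerate(fields["paragraphs"], start=1):
    print(f"Paragraph {i}: {text}")
```

To use it on the real page, you could pass `response.text` from the request above into `extract_page_fields`. Returning a dictionary rather than printing directly makes it easier to save or post-process the results later.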
Let me know if you need any more specific information or features to be added to this crawler!