InternetAgent is a tool for conducting web-based searches, retrieving relevant web pages,
chunking their content, and summarizing them using a specified language model.
Attributes:
    chunk_size (int): Maximum token size for each chunk of webpage content.
    summerize_agent (Agent): An instance of the summarization agent for generating
        summaries from text chunks based on a system prompt.

Args:
    chunk_size (int): Token limit for text chunking.
    model (str): The name or identifier of the language model to be used.
    base_url (str): Base URL for the API that powers the summarization model.
    api_key (str): API key for authenticating with the model provider.
    temperature (float, optional): Sampling temperature for generation. Defaults to 0.1.
    max_token (int, optional): Maximum number of tokens allowed in the summary output.
        Defaults to 512.
    provider (str, optional): Name of the model provider (e.g., "openai"). Defaults
        to "openai".

Methods:
    start(query: str, num_result: int) -> list:
        Executes a web search for the given query, retrieves the content of the top
        results, splits them into chunks, summarizes them using the summarization
        agent, and returns a list of dictionaries with URL, title, and summarized
        content.
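The chunking step in that pipeline can be sketched as follows. This is a minimal illustration, not the library's implementation: it uses whitespace-separated words as a stand-in for model tokens, whereas the real agent would count tokens with the target model's tokenizer.

```python
def chunk_text(text: str, chunk_size: int) -> list[str]:
    """Split text into chunks of at most chunk_size tokens.

    Whitespace-separated words stand in for model tokens here; a real
    implementation would use the target model's tokenizer instead.
    """
    tokens = text.split()
    return [
        " ".join(tokens[i : i + chunk_size])
        for i in range(0, len(tokens), chunk_size)
    ]

# Each chunk holds at most chunk_size tokens:
chunks = chunk_text("one two three four five", chunk_size=2)
```

Each chunk is then summarized independently, so `chunk_size` trades off per-call context against the number of summarization calls.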
iragent.models.InternetAgent.fast_start(
    self,
    query: str,
    num_result: int,
    max_workers: int | None = None,
) -> list[dict]
A convenience wrapper that searches the Web, fetches the content of each hit,
breaks the text into token‑limited chunks, and asks a language‑model “summarizer”
to extract only the information relevant to the user’s query.
Attributes
----------
chunk_size : int
Maximum token length for each text chunk before it is passed to the
summarization model.
summerize_agent : Agent
A pre‑configured LLM “agent” used to turn a chunk of raw page text
into a concise, query‑focused summary.
Parameters
----------
chunk_size : int
Token limit used when splitting page text.
model : str
Name / identifier of the language model (e.g. ``"gpt-4o-2025-05-13"``).
base_url : str
Base URL for the model’s API endpoint.
api_key : str
API key or access token.
temperature : float, optional (default = 0.1)
Sampling temperature.
max_token : int, optional (default = 512)
Maximum length of each summary returned by the LLM.
provider : str, optional (default = ``"openai"``)
    Identifies the backend. ``"ollama"`` is special-cased because its local
    HTTP server does not work reliably with a single client shared across a
    pool of threads.
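One common way to implement that kind of special-casing is to give each worker thread its own client via ``threading.local``. The sketch below is an assumption about the approach, not the library's actual code; ``DummyClient`` is a hypothetical stand-in for the real HTTP client.

```python
import threading


class DummyClient:
    """Stand-in for an HTTP client that must not be shared across threads."""

    def __init__(self) -> None:
        # Record which thread created this client, for illustration.
        self.owner = threading.get_ident()


_local = threading.local()


def get_client() -> DummyClient:
    # Lazily create one client per thread instead of sharing a pooled one.
    if not hasattr(_local, "client"):
        _local.client = DummyClient()
    return _local.client
```

Within a single thread, repeated calls return the same client; a different thread transparently gets its own instance.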
Methods
----------
start(query, num_result)
Serial implementation; easy to read and useful for debugging.
fast_start(query, num_result, max_workers=None)
Threaded implementation that parallelises I/O for speed.
_summarize_page(result, query)
Worker routine run by each thread in ``fast_start``. Not public.
All methods return a ``list[dict]`` whose items look like::
{
"url": "<page URL>",
"title": "<page title>",
"content": "<summarised text>",
}
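Given that shape, the returned list can be consumed like any list of dicts, for example to build a single context string for a downstream prompt. The data here is hypothetical:

```python
# Hypothetical output from start() or fast_start():
results = [
    {"url": "https://a.example", "title": "A", "content": "summary A"},
    {"url": "https://b.example", "title": "B", "content": "summary B"},
]

# Concatenate the per-page summaries into one context string.
context = "\n\n".join(f"{r['title']}: {r['content']}" for r in results)
```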