#data-formatting

#ai
Data science
fromMedium
5 days ago

Data models: the shared language your AI and team are both missing

A shared data model is the common language both AI systems and human teams need to interpret an organization's data consistently.
Data science
fromTheregister
2 weeks ago

Datadog bets DIY AI will mean it dodges the SaaSpocalypse

Datadog is releasing an AI model to enhance its observability tools and mitigate risks from customers building their own solutions.
Information security
fromNextgov.com
1 day ago

Data is a strategic asset and a strategic vulnerability

Data is a primary strategic asset in national security, transforming into both a powerful tool and a critical vulnerability.
Tech industry
fromnews.bitcoin.com
54 minutes ago

AI Cloud Provider Coreweave Secures Anthropic Agreement for Claude Workloads

Coreweave signed a multi-year agreement with Anthropic to provide cloud infrastructure for AI model development and deployment.
#postgresql
DevOps
fromInfoQ
1 day ago

Google Cloud Highlights Ongoing Work on PostgreSQL Core Capabilities

Google Cloud has made significant technical contributions to PostgreSQL, enhancing logical replication, upgrade processes, and system stability.
Business intelligence
fromZDNET
2 days ago

I asked 5 data leaders about how they use AI to automate - and end integration nightmares

Strong processes and AI integration are essential for businesses to effectively utilize data.
Privacy professionals
fromComputerworld
2 days ago

Questions raised about how LinkedIn uses the petabytes of data it collects

LinkedIn users should limit identifiable data exposure and treat the platform as potentially hostile until BrowserGate allegations are verified.
Data science
fromFast Company
4 days ago

Data, not infrastructure, must drive your AI strategy

Data centricity is essential for effective AI strategies, enabling collaboration and problem-solving across business units by making data accessible.
Software development
fromTechzine Global
3 days ago

Rubrik SAGE: using AI to govern agentic workforces

Enterprises must adopt AI-powered semantic governance to manage the security challenges posed by creative and non-deterministic agentic AI systems.
Web development
fromSpeckyboy Design Magazine
5 days ago

How to Use Remote Data Blocks to Display Google Sheets Data in WordPress - Speckyboy

Remote Data Blocks plugin simplifies fetching and displaying external data in WordPress from services like Google Sheets and Airtable.
DevOps
fromInfoQ
22 hours ago

Etsy Migrates 1000-Shard, 425 TB MySQL Sharding Architecture to Vitess

Etsy migrated its MySQL sharding infrastructure to Vitess, enhancing data management and enabling resharding capabilities.
#snowflake
Django
fromMedium
1 week ago

Snowflake Supports Directory Imports

Snowflake now supports importing packages into functions and procedures directly from stage directories and SnowGit directories, streamlining development and deployment.
Artificial intelligence
fromTheregister
2 weeks ago

Snowflake's ongoing pitch: bring AI to data, not vice versa

Snowflake is enhancing its platform for AI integration through strategic partnerships and acquisitions, focusing on customer ROI and data management efficiency.
#structured-data
Data science
fromAol
5 days ago

Demystifying structured data: How to speak an LLM's native language

Structured data is essential for LLMs to accurately interpret and rank online content, enhancing search visibility and user engagement.
Artificial intelligence
fromComputerworld
3 days ago

AI often doesn't deliver ROI for IT departments either

Only 28% of AI projects in infrastructure and operations achieve meaningful ROI, with many failing due to unrealistic expectations and skills gaps.
DevOps
fromInfoWorld
2 days ago

Bringing databases and Kubernetes together

Automating Kubernetes workloads with Operators can provide the same level of functionality as DBaaS, while still avoiding lock-in to a specific provider.
Data science
fromTheregister
3 days ago

UK National Data Library plan needs work, study finds

The UK's National Data Library needs improved dataset accessibility to support AI development and meaningful analysis.
Information security
fromSilicon Canals
4 days ago

When militaries share data centers with banks: how Gulf strikes exposed a structural flaw in global cloud infrastructure - Silicon Canals

When civilian banks, logistics platforms, and payment processors share physical data center infrastructure with military AI systems, those facilities become legitimate military targets under international humanitarian law - and the civilian services housed inside lose their legal protection.
#aws
DevOps
fromInfoWorld
1 day ago

AWS targets AI agent sprawl with new Bedrock Agent Registry

AWS introduces Agent Registry to help enterprises manage and govern AI agents effectively.
DevOps
fromTheregister
2 days ago

AWS: Agents shouldn't be secret, so we built a registry

AWS Agent Registry enhances visibility and control over AI agents in corporate environments.
DevOps
fromTheregister
2 days ago

AWS put a file system on S3; I stress-tested it

AWS S3 Files allows mounting S3 buckets as NFS shares, providing solid conflict resolution and cost-effective storage options.
DevOps
fromTechzine Global
1 day ago

Cisco strengthens Splunk's AI observability by acquiring Galileo

Galileo provides AI teams with tools to evaluate the quality of AI outputs, detect errors before they reach end users, and continuously improve the behavior of AI agents in production.
DevOps
fromInfoQ
2 days ago

Uber's Hive Federation Decentralizes 16K Datasets and 10+ PB for Zero-Downtime Analytics at Scale

Uber redesigned its Hive data warehouse to decentralize datasets, enhancing scalability, security, and operational autonomy for teams.
Business intelligence
fromTheregister
1 week ago

Microsoft Fabric Database Hub dubbed 'partial' solution

Microsoft's Fabric Database Hub offers a centralized management solution for its database services but lacks support for non-Microsoft databases.
#nutanix
DevOps
fromTechzine Global
2 days ago

Nutanix won't give AI free rein: infrastructure remains a human endeavor

Nutanix focuses on facilitating AI workloads while maintaining human oversight in IT management, emphasizing minimal changes for infrastructure administrators.
DevOps
fromTechzine Global
4 days ago

As IT complexity escalates, Nutanix fights back

Nutanix is prioritizing flexibility and aims to be a leading agentic AI platform amidst external IT developments.
Healthcare
fromMedium
1 month ago

Real-Time Data Validation in Healthcare Streaming: Building Custom Schema Registry Patterns with...

In a single streaming pipeline, you might be processing HL7 FHIR messages with frequent specification updates, claims data following various payer-specific formats, provider directory information with inconsistent taxonomies, and patient demographics with privacy redaction requirements. Our member eligibility stream processes roughly 50,000 records per minute during peak enrollment periods.
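The per-record validation pattern this entry describes can be sketched in a few lines: validate each incoming record against a registered schema and route failures to a dead-letter queue instead of halting the stream. This is an illustrative sketch only; the field names and schema shape are assumptions, not taken from the article's actual FHIR or claims schemas.

```python
# Minimal sketch of per-record schema validation in a streaming pipeline.
# REQUIRED is a stand-in for a schema fetched from a schema registry.
REQUIRED = {"member_id": str, "plan_code": str, "effective_date": str}

def validate(record: dict) -> list[str]:
    """Return a list of violations; an empty list means the record passes."""
    errors = []
    for field, ftype in REQUIRED.items():
        if field not in record:
            errors.append(f"missing field: {field}")
        elif not isinstance(record[field], ftype):
            errors.append(f"bad type for {field}: {type(record[field]).__name__}")
    return errors

def process(stream):
    """Route valid records onward; send invalid ones to a dead-letter list."""
    valid, dead_letter = [], []
    for record in stream:
        errs = validate(record)
        (valid if not errs else dead_letter).append((record, errs))
    return valid, dead_letter
```

At 50,000 records per minute, the key design choice is that a bad record never blocks the pipeline: it is captured with its violation list for later replay.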
DevOps
fromFortune
2 days ago

The digital sovereignty dilemma is a false choice - here's how enterprises can have both | Fortune

Organizations must ensure digital sovereignty by balancing local control with global technology access to remain resilient and competitive.
Data science
fromInfoQ
2 weeks ago

Data Mesh in Action: A Journey From Ideation to Implementation

Data mesh is essential for organizations to develop independent data analytics capabilities after separation from larger parent companies.
DevOps
fromInfoWorld
3 days ago

AWS turns its S3 storage service into a file system for AI agents

S3 Files simplifies access to Amazon S3, enhancing its role as a primary data layer for AI and modern applications.
#ai-automation
Artificial intelligence
fromTechzine Global
3 weeks ago

Snowflake's Project SnowWork targets autonomous enterprise AI

Snowflake launches Project SnowWork, an autonomous AI interface that performs enterprise tasks like forecasts and reports without data team involvement, expanding from backend infrastructure to front-office productivity tool.
Artificial intelligence
fromTheregister
4 weeks ago

Perplexity: Everything is Computer

Perplexity launches Computer for Enterprise, an AI orchestration service that automates business tasks across integrated cloud applications like Gmail, Slack, and Salesforce.
DevOps
fromTechzine Global
3 days ago

AWS S3 buckets now support file systems

S3 Files is built on Amazon EFS and automatically translates file system operations into S3 requests, allowing applications to work with S3 data without code changes.
Business intelligence
fromInfoWorld
3 weeks ago

Snowflake's new 'autonomous' AI layer aims to do the work, not just answer questions

Project SnowWork is Snowflake's autonomous AI layer that automates data analysis tasks like forecasting, churn analysis, and report generation without requiring data team intervention.
Miscellaneous
fromInfoQ
1 month ago

Hybrid Cloud Data at Uber: How Engineers Solved Extreme-Scale Replication Challenges

Uber's engineering team has transformed its data replication platform to move petabytes of data daily across hybrid cloud and on-premise data lakes, addressing scaling challenges caused by rapidly growing workloads. Built on Hadoop's open-source Distcp framework, the platform now handles over one petabyte of daily replication and hundreds of thousands of jobs with improved speed, reliability, and observability.
Data science
fromMedium
3 weeks ago

Building Consistent Data Foundations at Scale

Building consistent data foundations through intentional architecture, engineering, and governance is essential to prevent fragmentation, support AI adoption, ensure regulatory compliance, and enable reliable organizational decisions at scale.
Business intelligence
fromTheregister
3 weeks ago

Microsoft promises multi database wrangling hub on Fabric

Microsoft launched Database Hub, a unified management tool within Fabric that consolidates multiple database services across on-premises, PaaS, and SaaS environments with AI-assisted capabilities.
DevOps
fromInfoQ
1 week ago

Replacing Database Sequences at Scale Without Breaking 100+ Services

Validating requirements can simplify complex problems, and embedding sequence generation reduces network calls, enhancing performance and reliability.
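The entry's point about embedding sequence generation to cut network calls is commonly implemented as block (hi/lo) allocation: each service reserves a whole range of IDs from the database in one round trip, then hands out values locally. A minimal sketch under that assumption; the fake db_reserve callable stands in for the real database call:

```python
import itertools
import threading

class BlockSequence:
    """Hand out IDs locally from pre-reserved blocks, so only one
    'network' call (reserve_block) is needed per block_size ids."""
    def __init__(self, reserve_block, block_size=1000):
        self._reserve = reserve_block
        self._size = block_size
        self._next = 0
        self._limit = 0          # starts exhausted: first call reserves a block
        self._lock = threading.Lock()

    def next_id(self) -> int:
        with self._lock:
            if self._next >= self._limit:
                start = self._reserve(self._size)   # one remote call per block
                self._next, self._limit = start, start + self._size
            value = self._next
            self._next += 1
            return value

# A fake "database" that reserves contiguous blocks atomically.
_counter = itertools.count(0)
calls = []
def db_reserve(size):
    start = next(_counter) * size
    calls.append(start)
    return start
```

With a block size of 1000, a service making 100,000 inserts needs only 100 database round trips for IDs instead of 100,000, which is the performance win the article alludes to.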
#observability
DevOps
fromTechzine Global
1 week ago

Observability warehouses, the next structural evolution for telemetry

Observability is essential for real-time insights in cloud systems, helping to reduce downtime and improve performance.
Data science
fromMedium
1 month ago

Migrating to the Lakehouse Without the Big Bang: An Incremental Approach

Query federation enables safe, incremental lakehouse migration by allowing simultaneous queries across legacy warehouses and new lakehouse systems without risky big bang cutover approaches.
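The federation approach described can be pictured as a thin router: each query goes to the lakehouse once its table has been migrated, and to the legacy warehouse otherwise, so cutover happens table by table. A hedged sketch; the backends here are plain callables standing in for real engine connectors, and the table names are invented:

```python
# During an incremental migration, the set of migrated tables grows over
# time; the router consults it on every query, so no big-bang cutover
# is ever needed.
MIGRATED_TABLES = {"orders", "customers"}

def route(table: str, query: str, legacy, lakehouse):
    """Dispatch a query to whichever system currently owns the table."""
    backend = lakehouse if table in MIGRATED_TABLES else legacy
    return backend(query)
```

Real federation engines (Trino, for example) do this resolution inside the query planner, but the per-table ownership idea is the same.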
Business intelligence
fromInfoWorld
4 weeks ago

Why Postgres has won as the de facto database: Today and for the agentic future

Leading enterprises achieve 5x ROI by adopting open source databases like PostgreSQL to unify structured and unstructured data for agentic AI, with 81% of successful enterprises committed to open source strategies.
EU data protection
fromDbmaestro
4 years ago

5 Pillars of Database Compliance Automation

There is a growing emphasis on database compliance today due to the stricter enforcement of compliance rules and regulations to safeguard user privacy. For example, GDPR fines can reach £17.5 million or 4% of annual global turnover (the higher of the two applies). Besides the direct monetary implications, companies also need to prioritize compliance to protect their brand reputation and achieve growth.
Software development
fromInfoWorld
2 months ago

AI is changing the way we think about databases

Developers have spent the past decade trying to forget databases exist. Not literally, of course. We still store petabytes. But for the average developer, the database became an implementation detail; an essential but staid utility layer we worked hard not to think about. We abstracted it behind object-relational mappers (ORM). We wrapped it in APIs. We stuffed semi-structured objects into columns and told ourselves it was flexible.
Business intelligence
fromEntrepreneur
1 month ago

The Game-Changing Tech Saving Companies From Data Disasters

Combining Continuous Data Protection with AI capabilities enables businesses to achieve near-zero Recovery Point Objectives and minimal Recovery Time Objectives, preventing data loss and minimizing downtime.
DevOps
fromInfoQ
2 weeks ago

AWS Expands Aurora DSQL with Playground, New Tool Integrations, and Driver Connectors

Amazon Aurora DSQL introduces usability enhancements, including a browser-based playground and integrations with popular SQL tools for improved developer experience.
#mariadb-acquisition
Business intelligence
fromInfoWorld
1 month ago

MariaDB taps GridGain to keep pace with AI-driven data demands

MariaDB's acquisition of GridGain aims to create an integrated platform combining relational database reliability with in-memory computing speed to compete with hyperscaler offerings.
Startup companies
fromInfoQ
2 months ago

Etleap Launches Iceberg Pipeline Platform to Simplify Enterprise Adoption of Apache Iceberg

Managed Iceberg pipeline platform unifies ingestion, transformation, orchestration, and table operations inside customers' VPCs, enabling enterprise Iceberg adoption without building custom stacks.
Tech industry
fromTechzine Global
2 months ago

4 steps to create a future-proof data infrastructure

A future-proof IT infrastructure is often positioned as a universal solution that can withstand any change. However, such a solution does not exist. Nevertheless, future-proofing is an important concept for IT leaders navigating continuous technological developments and security risks, all while ensuring that daily business operations continue. The challenge is finding a balance between reactive problem solving and proactive planning, because overlooking a change can cost your organization. So, how do you successfully prepare for the future without that one-size-fits-all solution?
Artificial intelligence
fromInfoWorld
1 month ago

Why AI requires rethinking the storage-compute divide

AI workloads require continuous processing of unstructured multimodal data, causing redundant data movement and transformation that wastes infrastructure costs and data scientist time.
DevOps
fromInfoWorld
3 weeks ago

Update your databases now to avoid data debt

Multiple major open source databases reach end-of-life in 2026, requiring teams to plan upgrades and migrations to avoid security risks and higher costs.
Data science
fromInfoWorld
1 month ago

The revenge of SQL: How a 50-year-old language reinvents itself

SQL has experienced a major comeback driven by SQLite in browsers, improved language tools, and PostgreSQL's jsonb type, making it both traditional and exciting for modern development.
DevOps
fromInfoWorld
3 weeks ago

Cloud-based LLMs risk enterprise stability

Enterprises must return to architectural resilience principles when adopting cloud-hosted LLMs to mitigate risks from increasingly common outages that cause widespread business disruption.
Software development
fromMedium
2 months ago

The Complete Database Scaling Playbook: From 1 to 10,000 Queries Per Second

Database scaling to 10,000 QPS requires staged architectural strategies timed to traffic thresholds to avoid outages or unnecessary cost.
Tech industry
fromTheregister
1 month ago

Oracle promises new approach to MySQL

Oracle commits to new engineering leadership, developer-focused features, greater transparency, and expanded community engagement to guide MySQL through 2026 and beyond.
Software development
fromInfoQ
2 months ago

Somtochi Onyekwere on Distributed Data Systems, Eventual Consistency and Conflict-free Replicated Data Types

In today's episode, I will be speaking with Somtochi Onyekwere, software engineer at Fly.io. We will discuss the recent developments in distributed data systems, especially topics like eventual consistency and how to achieve fast, eventually consistent replication across distributed nodes. We'll also talk about the conflict-free replicated data type data structures, also known as CRDTs, and how they can help with conflict resolution when managing data in distributed data storage systems.
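The CRDTs discussed in the episode are easiest to see with the simplest one, a grow-only counter: each node increments only its own slot, and merging takes the per-node maximum, so replicas converge no matter the order in which updates arrive. A minimal sketch, not code from the episode:

```python
class GCounter:
    """Grow-only counter CRDT: one slot per node; merge = element-wise max.
    Merge is commutative, associative, and idempotent, which is what lets
    replicas converge without coordination."""
    def __init__(self, node_id: str):
        self.node_id = node_id
        self.counts: dict[str, int] = {}

    def increment(self, amount: int = 1):
        self.counts[self.node_id] = self.counts.get(self.node_id, 0) + amount

    def merge(self, other: "GCounter"):
        for node, count in other.counts.items():
            self.counts[node] = max(self.counts.get(node, 0), count)

    def value(self) -> int:
        return sum(self.counts.values())
```

Because merge is idempotent, replaying the same replication message twice is harmless, which is exactly the property that makes eventually consistent replication robust to retries.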
Information security
fromSecuritymagazine
2 months ago

Product Spotlight on Analytics

Taelor Sutherland is Associate Editor at Security magazine covering enterprise security, coordinating digital content, and holding a BA in English Literature from Agnes Scott College.
Software development
fromInfoWorld
2 months ago

4 self-contained databases for your apps

XAMPP provides a complete local web stack (MariaDB, Apache, PHP, Mercury SMTP, OpenSSL) while PostgreSQL can be run standalone or embedded via pgserver in Python.
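Alongside the options the article lists, Python's standard library ships its own self-contained database, sqlite3, which runs entirely in-process and needs no server at all; the table and data below are invented for illustration:

```python
import sqlite3

# SQLite is the classic self-contained database: the whole engine lives
# in-process, and ":memory:" needs no file, server, or configuration.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE apps (name TEXT, stars INTEGER)")
conn.executemany(
    "INSERT INTO apps VALUES (?, ?)",
    [("wiki", 120), ("blog", 45), ("shop", 300)],
)
rows = conn.execute(
    "SELECT name FROM apps WHERE stars > ? ORDER BY stars DESC", (100,)
).fetchall()
conn.close()
```

The same zero-administration property is what pgserver aims to give PostgreSQL in Python: a full SQL engine embedded in the application process.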
DevOps
fromInfoQ
1 month ago

From Minutes to Seconds: Uber Boosts MySQL Cluster Uptime with Consensus Architecture

Uber redesigned MySQL infrastructure using Group Replication to reduce failover time from minutes to seconds while maintaining strong consistency across thousands of clusters.
#dynamodb
DevOps
fromInfoQ
1 month ago

Google BigQuery Previews Cross-Region SQL Queries for Distributed Data

BigQuery's global queries feature enables SQL queries across multiple geographic regions without data movement, eliminating ETL pipelines for distributed analytics.
Business intelligence
fromFast Company
1 month ago

Beware of data hubris

Organizations are drowning in dashboards, KPIs, performance metrics, behavioral traces, biometric indicators, predictive scores, engagement rates, and AI-generated forecasts. We have more data than we know what to do with. We pretend that the mere presence of data guarantees clarity. It does not. That's data hubris—the arrogant belief that because something can be measured, it can be mastered.
Data science
fromTechzine Global
1 month ago

Ataccama puts agentic data observability into platform core

Ataccama ONE introduces Agentic Data Observability technology to ensure high-quality, reliable data for AI systems while preventing autonomous errors and bias in regulated enterprises.
Software development
fromInfoWorld
2 months ago

Why your next microservices should be streaming SQL-driven

Streaming SQL with UDFs, materialized results, and ML/AI integrations enables continuous, stateful processing of event streams for microservices.
Artificial intelligence
fromMedium
2 months ago

Extracting AI-Ready Data From Organizational Documents

Poor document extraction corrupts retrieval; preserving document structure at ingestion produces reliable embeddings and trustworthy RAG outputs.
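The structure-preservation point can be made concrete: chunk a document at its headings and prefix each chunk with its heading, so the resulting embeddings carry the section context a blind fixed-size split would destroy. A hedged sketch; the markdown-style "#" heading convention is an assumption for illustration, not the article's actual extraction format:

```python
# Structure-preserving chunking: split at headings, keep each section's
# heading attached to its body so retrieval sees the context.
def chunk_by_structure(text: str) -> list[str]:
    chunks, heading, body = [], "", []
    def flush():
        if body:
            chunks.append((heading + "\n" if heading else "") + "\n".join(body))
            body.clear()
    for line in text.splitlines():
        if line.startswith("#"):        # new section: close out the old one
            flush()
            heading = line.lstrip("#").strip()
        elif line.strip():
            body.append(line.strip())
    flush()
    return chunks
```

A chunk that reads "Refunds: full refund within 30 days" retrieves far better than the same sentence stranded without its heading.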
Data science
fromInfoWorld
2 months ago

Snowflake debuts Cortex Code, an AI agent that understands enterprise data context

Cortex Code enables developers to use natural language to build, optimize, and deploy governed, production-ready data pipelines, analytics, ML workloads, and AI agents.
Business intelligence
fromTechzine Global
2 months ago

ClickHouse, the open-source challenger to Snowflake and Databricks

ClickHouse is a high-performance columnar OLAP database rapidly adopted by AI and enterprise users, now valued at $15B and acquiring Langfuse.
fromNew Relic
3 months ago

The Power and Cost of Data Cardinality

The more attributes you add to your metrics, the more complex and valuable questions you can answer. Every additional attribute provides a new dimension for analysis and troubleshooting. For instance, adding an infrastructure attribute, such as region can help you determine if a performance issue is isolated to a specific geographic area or is widespread. Similarly, adding business context, like a store location attribute for an e-commerce platform, allows you to understand if an issue is specific to a particular set of stores
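The cost side of that trade-off scales multiplicatively: the worst-case number of distinct time series is the product of each attribute's distinct values, which is why a single extra attribute can be expensive. A toy calculation with made-up cardinalities, not New Relic figures:

```python
from math import prod

# Each attribute multiplies the number of distinct time series you store.
attribute_cardinality = {
    "service": 40,
    "region": 6,
    "status_code": 8,
}

def series_count(cardinalities: dict[str, int]) -> int:
    """Worst-case distinct series = product of per-attribute cardinalities."""
    return prod(cardinalities.values())

base = series_count(attribute_cardinality)                       # 40 * 6 * 8
with_store = series_count({**attribute_cardinality, "store": 500})
```

Adding a 500-value store attribute multiplies the series count by 500, which is the "power and cost" the headline refers to.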
Data science
fromComputerWeekly.com
2 months ago

AI slop pushes data governance towards zero-trust models | Computer Weekly

Unverified and low quality data generated by artificial intelligence (AI) models - often known as AI slop - is forcing more security leaders to look to zero-trust models for data governance, with 50% of organisations likely to start adopting such policies by 2028, according to Gartner's seers. Currently, large language models (LLMs) are typically trained on data scraped - with or without permission - from the world wide web and other sources including books, research papers, and code repositories.
Artificial intelligence
Data science
fromInfoQ
2 months ago

Beyond the Warehouse: Why BigQuery Alone Won't Solve Your Data Problems

Data warehouses like BigQuery perform well initially but become slow, costly, and disorganized at scale, undermining low-latency operational use and innovation.
Artificial intelligence
fromTechzine Global
2 months ago

Snowflake launches Cortex Code agent for understanding data context

Cortex Code is an AI agent that converts complex data engineering, ML, and analytics tasks into natural-language workflows integrated into Snowflake and developer tools.
Artificial intelligence
fromTechRepublic
6 months ago

New AI Data 'Universal Translator' From Salesforce, Snowflake, Others

Snowflake and other firms created the Open Semantic Interchange to standardize semantics and enable interoperable data sharing among AI-enabled products, reducing semantic mismatches.
Data science
fromCIO
2 months ago

5 perspectives on modern data analytics

Data/business analytics is the top IT investment priority, yet analytics projects often fail due to poor data, vague objectives, and one-size-fits-all solutions.
Artificial intelligence
fromInfoQ
2 months ago

MongoDB Introduces Embedding and Reranking API on Atlas

MongoDB Atlas now offers an Embedding and Reranking API with Voyage AI models, enabling unified semantic search, automated embeddings, and integrated monitoring and billing.
Data science
fromDevOps.com
2 months ago

Why Data Contracts Need Apache Kafka and Apache Flink - DevOps.com

Data contracts formalize schemas, types, and quality constraints through early producer-consumer collaboration to prevent pipeline failures and reduce operational downtime.
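A data contract like the one described can be checked mechanically. Here is a hedged sketch of one common rule, backward compatibility: a new schema version may add fields, but removing or retyping an existing field breaks consumers. Schemas are simplified to plain {field: type_name} dicts standing in for a real schema-registry check:

```python
# Toy data-contract compatibility check between two schema versions.
def backward_compatible(old: dict, new: dict) -> list[str]:
    """Return the contract violations introduced by the new schema version."""
    violations = []
    for field, ftype in old.items():
        if field not in new:
            violations.append(f"removed field: {field}")
        elif new[field] != ftype:
            violations.append(f"retyped {field}: {ftype} -> {new[field]}")
    return violations
```

Running this check in CI before a producer deploys is the "early producer-consumer collaboration" the summary mentions: the pipeline failure is caught as a failed build, not as downtime.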
DevOps
fromDbmaestro
5 years ago

Database Delivery Automation in the Multi-Cloud World

The main advantage of going the Multi-Cloud way is that organizations can "put their eggs in different baskets" and be more versatile in their approach to how they do things. For example, they can mix it up and opt for a cloud-based Platform-as-a-Service (PaaS) solution when it comes to the database, while going the Software-as-a-Service (SaaS) route for their application endeavors.