Monday, April 20, 2026

Markdown + Astro = ❤️



Markdown is a superb invention that lets us write less markup. It also handles typographical details for us, like converting straight apostrophes (') into opening or closing curly quotes (‘ or ’).

Though Astro has built-in support for Markdown through .md files, I'd argue that your Markdown experience can be enhanced in two ways:

  1. MDX
  2. Markdown Component

I cover both of these in depth in Practical Astro: Content Systems.

We're going to focus on MDX today.

MDX

MDX is a superset of Markdown. It lets you use components and plain JSX in Markdown, along with all the usual Markdown features.

With Astro, you can also use components from any frontend framework you've installed. So you can do something like:

---
# Frontmatter...
---

import AstroComp from '@/components/AstroComp.astro'
import SvelteComp from '@/components/SvelteComp.svelte'

<AstroComp />
<SvelteComp />

It can be a great replacement for content-heavy stuff because it lets you write markup like the following:

<Card>
  ## Card Title

  Content goes here

  - List
  - Of
  - Items

  Second paragraph
</Card>

Astro will convert the MDX into the following HTML:

<h2>Card Title</h2>

<p>Content goes here</p>

<ul>
  <li>List</li>
  <li>Of</li>
  <li>Items</li>
</ul>

<p>Second paragraph</p>

Notice what I did above:

  • I used ## instead of a full h2 tag.
  • I used - instead of <ul> and <li> to denote lists.
  • I didn't need any paragraph tags.

Writing the whole thing directly in HTML would have been somewhat of a pain.

Installing MDX

The Astro folks have built an integration for MDX, so it's easy-peasy to add it to your project. Just follow these instructions.

Three Main Ways to Use MDX

These methods also work with standard Markdown files.

  1. Import it directly into an Astro file
  2. Through content collections
  3. Through a layout

Import it Directly

The first way is simply to import your MDX file and use it directly as a component.

---
import MDXComp from '../components/MDXComp.mdx'
---

Because of this, MDX can kinda function like a partial.

Through Content Collections

First, you feed your MDX into a content collection. Note that you have to add the mdx pattern to your glob here.

// src/content.config.js
import { defineCollection } from 'astro:content';
import { glob } from 'astro/loaders';

const blog = defineCollection({
  loader: glob({ pattern: "**/*.{md,mdx}", base: "./src/blog" }),
});

export const collections = { blog };

You then retrieve the MDX file from the content collection.

---
import { getEntry, render } from 'astro:content'
const { slug } = Astro.props
const post = await getEntry('blog', slug)
const { Content } = await render(post)
---

While you're doing this, you can pass components into your MDX files so you don't have to import them individually in every file.

For example, here's how I would pass the Image component from Splendid Labz into each of my MDX files.

---
import { Image } from '@splendidlabz/astro'
// ...
const { Content } = await render(post)
const components = { Image }
---

In my MDX files, I can now use Image without importing it.
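As a sketch of how the two sides connect (building on the frontmatter examples above): the components map is passed to the rendered Content component in the Astro template, and the MDX file can then use the component as if it were in scope.

```astro
<!-- In the Astro page, after the frontmatter shown above -->
<Content components={components} />
```

Inside any MDX file rendered this way, <Image src="/images/example.jpg" alt="Example" /> now works without an import (the src here is just a placeholder).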


Use a Layout

Finally, you can add a layout property to the frontmatter of the MDX file.

---
title: Blog Post Title
layout: '@/layouts/MDX.astro'
---

This layout frontmatter should point to an Astro file.

In that file:

  • You can extract frontmatter properties from Astro.props.content.
  • The MDX content can be rendered with <slot />.

---
import Base from './Base.astro'
const props = Astro.props.content
const { title } = props
---

<Base title={title}>
  <slot />
</Base>

Caveats

Formatting and Linting Fails

ESLint and Prettier don't format MDX files well, so you'll end up manually indenting most of your markup.

That's fine for small amounts of markup. But if you have a lot of it… then the Markdown Component will be a much better choice.

More on that in another upcoming post.

The Astro RSS integration doesn't support MDX files out of the box.

Thankfully, this can be handled easily with Astro containers. I'll show you how to do this in Practical Astro.

Taking it Further

I've been building with Astro for 3+ years, and I kept running into the same friction points on content-heavy sites: blog pages, tag pages, pagination, and folder structures that get messy over time.

So I built Practical Astro: Content Systems, 7 ready-to-use solutions for Astro content workflows (MDX is just one of them). You get both the code and the thinking behind it.

If you want a cleaner, calmer content workflow, check it out.

I also write about Astro patterns and using Tailwind and CSS together on my blog. Come by and say hi!

How to Crawl an Entire Documentation Site with Olostep




Image by Author

 

Introduction

 
Web crawling is the process of automatically visiting web pages, following links, and collecting content from a website in a structured way. It's commonly used to gather large amounts of data from documentation sites, articles, knowledge bases, and other web sources.

Crawling an entire website and then converting that content into a format an AI agent can actually use is not as simple as it sounds. Documentation sites often contain nested pages, repeated navigation links, boilerplate content, and inconsistent page structures. On top of that, the extracted content needs to be cleaned, organized, and stored in a way that's useful for downstream AI workflows such as retrieval, question answering, or agent-based systems.

In this guide, we'll learn why to use Olostep instead of Scrapy or Selenium, set up everything needed for the web crawling project, write a simple crawling script to scrape a documentation site, and finally create a frontend using Gradio so that anyone can provide a link and other arguments to crawl website pages.

 

Choosing Olostep Over Scrapy or Selenium

Scrapy is powerful, but it's built as a full scraping framework. That's useful when you want deep control, but it also means more setup and more engineering work.

Selenium is better known for browser automation. It's useful for interacting with JavaScript-heavy pages, but it's not really designed as a documentation crawling workflow on its own.

With Olostep, the pitch is far more direct: search, crawl, scrape, and structure web data through one application programming interface (API), with support for LLM-friendly outputs like Markdown, text, HTML, and structured JSON. That means you don't have to manually stitch together pieces for discovery, extraction, formatting, and downstream AI use.

For documentation sites, that gives you a much faster path from URL to usable content, because you spend less time building the crawling stack yourself and more time working with the content you actually need.

 

Installing the Packages and Setting the API Key

First, install the Python packages used in this project. The official Olostep software development kit (SDK) requires Python 3.11 or later.

pip install olostep python-dotenv tqdm

 

These packages handle the main parts of the workflow:

  • olostep connects your script to the Olostep API
  • python-dotenv loads your API key from a .env file
  • tqdm adds a progress bar so you can monitor saved pages

Next, create a free Olostep account, open the dashboard, and generate an API key from the API keys page. Olostep's official docs and integrations point users to the dashboard for API key setup.

 

Olostep Dashboard API Key Setup

 

Then create a .env file in your project folder:

OLOSTEP_API_KEY=your_real_api_key_here

This keeps your API key separate from your Python code, which is a cleaner and safer way to manage credentials.

 

Creating the Crawler Script

In this part of the project, we'll build the Python script that crawls a documentation site, extracts each page as Markdown, cleans the content, and saves it locally as individual files. We'll create the project folder, add a Python file, and then write the code step by step so it's easy to follow and test.

First, create a project folder for your crawler. Inside that folder, create a new Python file named crawl_docs_with_olostep.py.

Now we'll add the code to this file one section at a time. This makes it easier to understand what each part of the script does and how the full crawler works together.

// Defining the Crawl Settings

Start by importing the required libraries. Then define the main crawl settings, such as the starting URL, crawl depth, page limit, include and exclude rules, and the output folder where the Markdown files will be saved. These values control how much of the documentation site gets crawled and where the results are stored.

import os
import re
from pathlib import Path
from urllib.parse import urlparse

from dotenv import load_dotenv
from tqdm import tqdm
from olostep import Olostep

START_URL = "https://docs.olostep.com/"
MAX_PAGES = 10
MAX_DEPTH = 1

INCLUDE_URLS = [
    "/**"
]

EXCLUDE_URLS = []

OUTPUT_DIR = Path("olostep_docs_output")

 

// Creating a Helper Function to Generate Safe File Names

Each crawled page needs to be saved as its own Markdown file. To do that, we need a helper function that converts a URL into a clean, filesystem-safe file name. This avoids problems with slashes, symbols, and other characters that don't work well in file names.

def slugify_url(url: str) -> str:
    parsed = urlparse(url)
    path = parsed.path.strip("/")

    if not path:
        path = "index"

    filename = re.sub(r"[^a-zA-Z0-9/_-]+", "-", path)
    filename = filename.replace("/", "__").strip("-_")

    return f"{filename or 'page'}.md"
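To make the mapping concrete, here's a quick self-contained check of this helper (the URLs are just examples; the function body is repeated so the snippet runs on its own):

```python
import re
from urllib.parse import urlparse

def slugify_url(url: str) -> str:
    # Turn a URL path into a flat, filesystem-safe Markdown file name.
    parsed = urlparse(url)
    path = parsed.path.strip("/")

    if not path:
        path = "index"

    filename = re.sub(r"[^a-zA-Z0-9/_-]+", "-", path)
    filename = filename.replace("/", "__").strip("-_")

    return f"{filename or 'page'}.md"

print(slugify_url("https://docs.olostep.com/"))                     # index.md
print(slugify_url("https://docs.olostep.com/get-started/welcome"))  # get-started__welcome.md
```

Nested paths collapse into a single file name, so the output folder stays flat.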

 

// Creating a Helper Function to Save Markdown Files

Next, add helper functions to process the extracted content before saving it.

The first function cleans the Markdown by removing extra interface text, repeated blank lines, and unwanted page elements such as feedback prompts. This keeps the saved files focused on the actual documentation content.

def clean_markdown(markdown: str) -> str:
    text = markdown.replace("\r\n", "\n").strip()
    # Strip empty anchor links such as [​](#section) left over from the page UI.
    text = re.sub(r"\[\s*\u200b?\s*\]\(#.*?\)", "", text, flags=re.DOTALL)

    lines = [line.rstrip() for line in text.splitlines()]

    # Skip everything before the first heading: either a Setext H1
    # (a line underlined with "="), or an ATX "# " heading as a fallback.
    start_index = 0
    for index in range(len(lines) - 1):
        title = lines[index].strip()
        underline = lines[index + 1].strip()
        if title and underline and set(underline) == {"="}:
            start_index = index
            break
    else:
        for index, line in enumerate(lines):
            if line.lstrip().startswith("# "):
                start_index = index
                break

    lines = lines[start_index:]

    # Drop the feedback widget and everything after it.
    for index, line in enumerate(lines):
        if line.strip() == "Was this page helpful?":
            lines = lines[:index]
            break

    # Remove leftover UI strings and collapse repeated blank lines.
    cleaned_lines: list[str] = []
    for line in lines:
        stripped = line.strip()
        if stripped in {"Copy page", "YesNo", "⌘I"}:
            continue
        if not stripped and cleaned_lines and not cleaned_lines[-1]:
            continue
        cleaned_lines.append(line)

    return "\n".join(cleaned_lines).strip()
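The least obvious step in the cleaner is the scan for a Setext-style H1 (a title underlined with = characters). Here is that step in isolation, on made-up sample lines, showing how it finds where the real content starts:

```python
# Made-up page lines: nav residue first, then the real Setext-underlined title.
lines = ["Home", "Docs", "", "Getting Started", "===============", "", "Body text."]

start_index = 0
for index in range(len(lines) - 1):
    title = lines[index].strip()
    underline = lines[index + 1].strip()
    # A Setext H1: a non-empty line whose next line consists only of "=".
    if title and underline and set(underline) == {"="}:
        start_index = index
        break

print(lines[start_index:])  # ['Getting Started', '===============', '', 'Body text.']
```

Everything before the matched title (the navigation residue) is discarded.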

 

The second function saves the cleaned Markdown into the output folder and adds the source URL at the top of the file.

def save_markdown(output_dir: Path, url: str, markdown: str) -> None:
    output_dir.mkdir(parents=True, exist_ok=True)
    filepath = output_dir / slugify_url(url)

    content = f"""---
source_url: {url}
---

{markdown}
"""
    filepath.write_text(content, encoding="utf-8")
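The resulting file starts with a tiny frontmatter block recording where the page came from. A minimal sketch of that on-disk layout (the URL and body are placeholders, written to a temporary directory):

```python
import tempfile
from pathlib import Path

url = "https://docs.olostep.com/quickstart"
markdown = "# Quickstart\n\nInstall the SDK."

# Same layout save_markdown produces: frontmatter with the source URL, then the body.
content = f"""---
source_url: {url}
---

{markdown}
"""

with tempfile.TemporaryDirectory() as tmp:
    filepath = Path(tmp) / "quickstart.md"
    filepath.write_text(content, encoding="utf-8")
    saved = filepath.read_text(encoding="utf-8")

print(saved.splitlines()[1])  # source_url: https://docs.olostep.com/quickstart
```

Keeping the source URL in frontmatter makes it easy to cite or re-crawl individual pages later.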

 

There is also a small helper function to clear old Markdown files before saving a new crawl result.

def clear_output_dir(output_dir: Path) -> None:
    if not output_dir.exists():
        return

    for filepath in output_dir.glob("*.md"):
        filepath.unlink()

 

// Creating the Main Crawler Logic

This is the main part of the script. It loads the API key from the .env file, creates the Olostep client, starts the crawl, waits for it to finish, retrieves each crawled page as Markdown, cleans the content, and saves it locally.

This section ties everything together and turns the individual helper functions into a working documentation crawler.

def main() -> None:
    load_dotenv()
    api_key = os.getenv("OLOSTEP_API_KEY")

    if not api_key:
        raise RuntimeError("Missing OLOSTEP_API_KEY in your .env file.")

    client = Olostep(api_key=api_key)

    crawl = client.crawls.create(
        start_url=START_URL,
        max_pages=MAX_PAGES,
        max_depth=MAX_DEPTH,
        include_urls=INCLUDE_URLS,
        exclude_urls=EXCLUDE_URLS,
        include_external=False,
        include_subdomain=False,
        follow_robots_txt=True,
    )

    print(f"Started crawl: {crawl.id}")
    crawl.wait_till_done(check_every_n_secs=5)

    pages = list(crawl.pages())
    clear_output_dir(OUTPUT_DIR)

    for page in tqdm(pages, desc="Saving pages"):
        try:
            content = page.retrieve(["markdown"])
            markdown = getattr(content, "markdown_content", None)

            if markdown:
                save_markdown(OUTPUT_DIR, page.url, clean_markdown(markdown))
        except Exception as exc:
            print(f"Failed to retrieve {page.url}: {exc}")

    print(f"Done. Files saved in: {OUTPUT_DIR.resolve()}")


if __name__ == "__main__":
    main()

 

Note: The full script is available here: kingabzpro/web-crawl-olostep, a web crawler and starter web app built with Olostep.

 

// Testing the Web Crawling Script

Once the script is complete, run it from your terminal:

python crawl_docs_with_olostep.py

As the script runs, you will see the crawler process the pages and save them one by one as Markdown files in your output folder.

 

Olostep Crawler Terminal Progress

 

After the crawl finishes, open the saved files to check the extracted content. You should see clean, readable Markdown versions of the documentation pages.

 

Clean Markdown Output Example

 

At that point, your documentation content is ready to use in AI workflows such as search, retrieval, or agent-based systems.

 

Creating the Olostep Web Crawling Web Application

In this part of the project, we'll build a simple web application on top of the crawler script. Instead of editing the Python file every time, this application gives you an easier way to enter a documentation URL, choose crawl settings, run the crawl, and preview the saved Markdown files in one place.

The frontend code for this application is available in app.py in the repository: web-crawl-olostep/app.py.

This application does a few useful things:

  • Lets you enter a starting URL for the crawl
  • Lets you set the maximum number of pages to crawl
  • Lets you control crawl depth
  • Lets you add include and exclude URL patterns
  • Runs the backend crawler directly from the interface
  • Saves the crawled pages into a folder based on the URL
  • Shows all saved Markdown files in a dropdown
  • Previews each Markdown file directly inside the application
  • Lets you clear previous crawl results with one button

To start the application, run:

python app.py

After that, Gradio will start a local web server and print a link like this:

* Running on local URL: http://127.0.0.1:7860
* To create a public link, set `share=True` in `launch()`.

 

Once the application is running, open the local URL in your browser. In our example, we gave the application the Claude Code documentation URL and asked it to crawl 50 pages with a depth of 5.

 

Gradio Interface for Documentation Crawling

 

When you click Run Crawl, the application passes your settings to the backend crawler and starts the crawl. In the terminal, you can watch the progress as pages are crawled and saved one by one.

 

Crawler Terminal Output

 

After the crawl finishes, the output folder will contain the saved Markdown files. In this example, you'd see that 50 files were added.

 

Saved Markdown Files in Output Folder

 

The dropdown in the application is then updated automatically, so you can open any saved file and preview it directly in the web interface as properly formatted Markdown.

 

Markdown Preview in Gradio Application

 

This makes the crawler much easier to use. Instead of changing values in code every time, you can test different documentation sites and crawl settings through a simple interface. That also makes the project easier to share with people who may not want to work directly in Python.

 

Final Takeaway

Web crawling is not only about collecting pages from a website. The real challenge is turning that content into clean, structured data that an AI system can actually use. In this project, we used a simple Python script and a Gradio application to make that process much easier.

Just as importantly, the workflow is fast enough for real use. In our example, crawling 50 pages with a depth of 5 took only around 50 seconds, which shows that you can prepare documentation data quickly without building a heavy pipeline.

This setup can also go beyond a one-time crawl. You could schedule it to run daily with cron or Task Scheduler, or even update only the pages that have changed. That keeps your documentation fresh while using only a small number of credits.

For teams that need this kind of workflow to make business sense, Olostep is built with that in mind. It is significantly more affordable than building or maintaining an internal crawling solution, and at least 50% cheaper than comparable solutions on the market.

As your usage grows, the cost per request continues to decrease, which makes it a practical choice for larger documentation pipelines. That combination of reliability, scalability, and strong unit economics is why some of the fastest-growing AI-native startups rely on Olostep to power their data infrastructure.
 
 

Abid Ali Awan (@1abidaliawan) is a certified data scientist professional who loves building machine learning models. Currently, he is focusing on content creation and writing technical blogs on machine learning and data science technologies. Abid holds a Master's degree in technology management and a bachelor's degree in telecommunication engineering. His vision is to build an AI product using a graph neural network for students struggling with mental illness.

OpenAI Scales Trusted Access for Cyber Defense With GPT-5.4-Cyber: a Fine-Tuned Model Built for Verified Security Defenders


Cybersecurity has always had a dual-use problem: the same technical knowledge that helps defenders find vulnerabilities can also help attackers exploit them. For AI systems, that tension is sharper than ever. Restrictions meant to prevent harm have historically created friction for good-faith security work, and it can be genuinely difficult to tell whether any particular cyber action is intended for defensive use or to cause harm. OpenAI is now proposing a concrete structural solution to that problem: verified identity, tiered access, and a purpose-built model for defenders.

The OpenAI team announced that it is scaling up its Trusted Access for Cyber (TAC) program to thousands of verified individual defenders and hundreds of teams responsible for defending critical software. The main focus of this expansion is the introduction of GPT-5.4-Cyber, a variant of GPT-5.4 fine-tuned specifically for defensive cybersecurity use cases.

What Is GPT-5.4-Cyber and How Does It Differ From Standard Models?

If you're an AI engineer or data scientist who has worked with large language models on security tasks, you're likely familiar with the frustrating experience of a model refusing to analyze a piece of malware or explain how a buffer overflow works, even in a clearly research-oriented context. GPT-5.4-Cyber is designed to eliminate that friction for verified users.

Unlike standard GPT-5.4, which applies blanket refusals to many dual-use security queries, GPT-5.4-Cyber is described by OpenAI as "cyber-permissive", meaning it has a deliberately lower refusal threshold for prompts that serve a legitimate defensive purpose. That includes binary reverse engineering, enabling security professionals to analyze compiled software for malware potential, vulnerabilities, and security robustness without access to the source code.

Binary reverse engineering without source code is a significant capability unlock. In practice, defenders routinely need to analyze closed-source binaries (firmware on embedded devices, third-party libraries, or suspected malware samples) without access to the original code. OpenAI describes the model as a GPT-5.4 variant purposely fine-tuned for additional cyber capabilities, with fewer capability restrictions and support for advanced defensive workflows, including binary reverse engineering without source code.

There are also hard limits. Users with trusted access must still abide by OpenAI's Usage Policies and Terms of Use. The approach is designed to reduce friction for defenders while preventing prohibited behavior, including data exfiltration, malware creation or deployment, and destructive or unauthorized testing. This distinction matters: TAC lowers the refusal boundary for legitimate work, but does not suspend policy for any user.

There are also deployment constraints. Use in zero-data-retention environments is restricted, given that OpenAI has less visibility into the user, environment, and intent in those configurations, a tradeoff the company frames as a necessary control surface in a tiered-access model. For dev teams accustomed to running API calls in Zero-Data-Retention mode, this is an important implementation constraint to plan around before building pipelines on top of GPT-5.4-Cyber.

The Tiered Access Framework: How TAC Actually Works

TAC is not a checkbox feature; it is an identity-and-trust-based access framework with multiple tiers. Understanding the structure matters if you or your team plans to integrate these capabilities.

The access process runs through two paths. Individual users can verify their identity at chatgpt.com/cyber. Enterprises can request trusted access for their team through an OpenAI representative. Customers approved through either path gain access to model versions with reduced friction around safeguards that would otherwise trigger on dual-use cyber activity. Approved uses include security education, defensive programming, and responsible vulnerability research. TAC customers who want to go further and authenticate as cyber defenders can express interest in additional access tiers, including GPT-5.4-Cyber. Deployment of the more permissive model is starting with a limited, iterative rollout to vetted security vendors, organizations, and researchers.

That means OpenAI is now drawing at least three practical lines instead of one: there is baseline access to general models; there is trusted access to current models with less unintentional friction for legitimate security work; and there is a higher tier of more permissive, more specialized access for vetted defenders who can justify it.

The framework is grounded in three explicit principles. The first is democratized access: using objective criteria and methods, including strong KYC and identity verification, to determine who can access more advanced capabilities, with the goal of making those capabilities accessible to legitimate actors of all sizes, including those defending critical infrastructure and public services. The second is iterative deployment: OpenAI updates models and safety systems as it learns more about the benefits and risks of specific versions, including improving resilience to jailbreaks and adversarial attacks. The third is ecosystem resilience, which includes targeted grants, contributions to open-source security projects, and tools like Codex Security.

How the Safety Stack Is Built: From GPT-5.2 to GPT-5.4-Cyber

It is worth understanding how OpenAI has structured its safety architecture across model versions, because TAC is built on top of that architecture, not instead of it.

OpenAI began cyber-specific safety training with GPT-5.2, then expanded it with additional safeguards through GPT-5.3-Codex and GPT-5.4. A critical milestone in that progression: GPT-5.3-Codex is the first model OpenAI is treating as High cybersecurity capability under its Preparedness Framework, which requires additional safeguards. Those safeguards include training the model to refuse clearly malicious requests like stealing credentials.

The Preparedness Framework is OpenAI's internal evaluation rubric for classifying how dangerous a given capability level could be. Reaching "High" under that framework is what triggered deployment of the full cybersecurity safety stack: not just model-level training, but an additional automated monitoring layer. Alongside safety training, automated classifier-based monitors detect signs of suspicious cyber activity and route high-risk traffic to a less cyber-capable model, GPT-5.2. In other words, if a request looks suspicious enough to cross a threshold, the platform does not just refuse; it silently reroutes the traffic to a safer fallback model. This is a key architectural detail: safety is enforced not only inside model weights, but also at the infrastructure routing layer.

GPT-5.4-Cyber extends this stack further upward: more permissive for verified defenders, but wrapped in stronger identity and deployment controls to compensate.

Key Takeaways

  • TAC is an access-control solution, not just a model launch. OpenAI's Trusted Access for Cyber program uses verified identity, trust signals, and tiered access to determine who gets enhanced cyber capabilities, shifting the safety boundary away from prompt-level refusal filters and toward a full deployment architecture.
  • GPT-5.4-Cyber is purpose-built for defenders, not general users. It is a fine-tuned variant of GPT-5.4 with a deliberately lower refusal boundary for legitimate security work, including binary reverse engineering without source code, a capability that directly reflects how real incident response and malware triage actually happen.
  • Safety is enforced in layers, not just in the model weights. GPT-5.3-Codex, the first model classified as "High" cyber capability under OpenAI's Preparedness Framework, introduced automated classifier-based monitors that silently reroute high-risk traffic to a less capable fallback model (GPT-5.2), meaning the safety stack lives at the infrastructure level too.
  • Trusted access does not suspend the rules. Regardless of tier, data exfiltration, malware creation or deployment, and destructive or unauthorized testing remain prohibited for every user; TAC reduces friction for defenders, it does not grant a policy exception.

Check out the technical details here.



Michal Sutter is a data science professional with a Master of Science in Data Science from the University of Padova. With a solid foundation in statistical analysis, machine learning, and data engineering, Michal excels at transforming complex datasets into actionable insights.

After 200 years, scientists finally crack the "dolomite problem"



For more than two centuries, scientists tried and failed to grow dolomite in the lab under conditions thought to match how it forms in nature. A recent study has finally changed that. Researchers from the University of Michigan and Hokkaido University in Sapporo, Japan succeeded by developing a new theory based on detailed atomic simulations.

Their work solves a long-standing geological puzzle known as the "Dolomite Problem." Dolomite is a widespread mineral found in iconic locations such as the Dolomite mountains in Italy, Niagara Falls, and Utah's Hoodoos. It is abundant in rocks older than 100 million years, yet it is rarely seen forming in more recent environments.

"If we understand how dolomite grows in nature, we might learn new strategies to promote the crystal growth of modern technological materials," said Wenhao Sun, the Dow Early Career Professor of Materials Science and Engineering at U-M and the corresponding author of the paper published in Science.

Why Dolomite Growth Is So Slow

The key breakthrough came from understanding what disrupts dolomite as it forms. In water, minerals typically grow as atoms attach in an orderly way to the surface of a crystal. Dolomite behaves differently because its structure is made of alternating layers of calcium and magnesium.

As the crystal grows, these two elements often attach randomly instead of lining up correctly. This creates structural defects that block further growth. The result is an extremely slow process. At that rate, forming a single well-ordered layer of dolomite could take as long as 10 million years.

Nature's Built-In Reset Mechanism

The researchers realized that these defects are not permanent. Atoms that are misplaced are less stable and more likely to dissolve when exposed to water. In natural environments, cycles such as rainfall or tidal changes repeatedly wash away these flawed regions.

Over time, this process clears the surface so new, properly ordered layers can form. Instead of taking millions of years for a single layer, dolomite can gradually build up in far shorter intervals. Over long geological periods, this leads to the massive deposits seen in ancient rock formations.

Simulating Crystal Development on the Atomic Degree

To check their concept, the group wanted to mannequin how atoms work together as dolomite varieties. This requires calculating the power concerned in numerous interactions between electrons and atoms, which is often extraordinarily demanding when it comes to computing energy.

Researchers at U-M’s Predictive Construction Supplies Science (PRISMS) Middle developed software program that simplifies this problem. It calculates the power for sure atomic preparations after which predicts others based mostly on the symmetry of the crystal construction.

“Our software program calculates the power for some atomic preparations, then extrapolates to foretell the energies for different preparations based mostly on the symmetry of the crystal construction,” mentioned Brian Puchala, one of many software program’s lead builders and an affiliate analysis scientist in U-M’s Division of Supplies Science and Engineering.

This method made it doable to simulate dolomite progress over timescales that replicate actual geological processes.

“Every atomic step would usually take over 5,000 CPU hours on a supercomputer. Now, we will do the identical calculation in 2 milliseconds on a desktop,” mentioned Joonsoo Kim, a doctoral scholar of supplies science and engineering and the research’s first writer.
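As a rough sanity check on those quoted figures, here is a back-of-the-envelope sketch of the implied speedup (the figures are from the article; the calculation itself is mine):

```python
# Back-of-the-envelope speedup implied by the quoted figures: 5,000 CPU-hours
# per atomic step on a supercomputer versus 2 milliseconds on a desktop.
supercomputer_seconds = 5_000 * 3_600   # 18,000,000 s
desktop_seconds = 0.002
speedup = supercomputer_seconds / desktop_seconds
print(f"{speedup:.0e}x")  # 9e+09x
```

That is roughly a nine-billion-fold reduction in compute time per step.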

Lab Experiment Confirms the Theory

Natural settings where dolomite still forms today often experience cycles of flooding followed by drying, which supports the team’s theory. However, direct experimental proof was still needed.

That proof came from Yuki Kimura, a professor of materials science at Hokkaido University, and Tomoya Yamazaki, a postdoctoral researcher in his lab. They used an unusual property of transmission electron microscopes to recreate the process.

“Electron microscopes usually use electron beams just to image samples,” Kimura said. “However, the beam can also split water, which makes acid that can cause crystals to dissolve. Usually this is bad for imaging, but in this case, dissolution is exactly what we wanted.”

The team placed a small dolomite crystal in a solution containing calcium and magnesium. They then pulsed the electron beam 4,000 times over two hours, repeatedly dissolving the defects as they formed.

After this process, the crystal grew to about 100 nanometers, roughly 250,000 times smaller than an inch. That growth represented around 300 layers of dolomite. Previous experiments had never produced more than five layers.

Implications for Modern Technology

Solving the Dolomite Problem does more than explain a geological mystery. It also offers insight into how to control crystal growth in advanced materials used in modern technology.

“In the past, crystal growers who wanted to make materials without defects would try to grow them really slowly,” Sun said. “Our theory shows that you can grow defect-free materials quickly, if you periodically dissolve the defects away during growth.”

This concept could help improve the manufacturing of semiconductors, solar panels, batteries, and other high-performance technologies.

The research was funded by the American Chemical Society PRF New Doctoral Investigator grant, the U.S. Department of Energy, and the Japan Society for the Promotion of Science.

Job turnover by occupation – FlowingData



Some occupations have more turnover than others. For example, waiters and waitresses tend to stay at the same job for fewer years than those in manager roles. In the chart below, see the median years spent at the same job for various occupations.

Sorted from longest median tenure to shortest

This is based on data from the Current Population Survey from 2018 to 2024. The survey asks respondents how many years they have been at their current job.

Firefighter and police supervisors have the longest tenures, while taxi drivers and motor vehicle operators have the shortest. This makes sense. Once you reach a supervisor role, you are more likely to stay with the job after working your way up. On the other hand, part-time jobs tend to have higher turnover.

The chart above shows median tenure along with the 25th and 75th percentiles, because people have been at the same job for varying amounts of time. For example, the median tenure for a web developer is 4 years and 11 months. The 25th percentile (lower tenure) is 3 years and 4 months, and the 75th percentile (higher tenure) is 7 years and 7 months.

A higher median tends to come with higher tenures overall, but that isn’t always the case. Some people in the same occupation have been at the same job for years, while others may have just started. There is a range within each occupation.
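For anyone who wants to reproduce this kind of summary, here is a minimal sketch of the percentile calculations, using made-up tenure values rather than the CPS microdata behind the charts:

```python
import numpy as np

# Hypothetical tenure values (years) for a single occupation -- illustrative
# only, not the CPS microdata used in the post.
tenure = np.array([1.2, 2.5, 3.3, 4.9, 4.9, 5.6, 7.6, 9.0, 11.5])

q1, median, q3 = np.percentile(tenure, [25, 50, 75])
iqr = q3 - q1  # spread between the 25th and 75th percentiles
print(median, round(iqr, 2))  # 4.9 4.3
```

The median summarizes typical tenure; the distance between the 25th and 75th percentiles captures how spread out tenures are within the occupation.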

Here is how median tenure compares against the spread between the 25th and 75th percentiles (known as the interquartile range) for each occupation.

Median tenure versus the spread from 25th to 75th percentile (IQR)

Plotting the data this way gives us four general categories of occupations: high turnover; higher turnover but with some long-term workers; mixed between short and long tenures; and long tenures with less spread.

Postal service clerks have the widest spread, which at least seems right, anecdotally speaking. Jobs with regular rotation, neither too high nor too low, cluster around the middle of the quadrants. Think teachers, office administrators, and nurses.

With all the recent layoffs, I thought I’d see more noise among developer-type jobs. They’re sort of in the lower-left quadrant, but maybe we’ll see more in the 2026 CPS release.

Notes

The data is based on samples from the Current Population Survey from 2018 through 2024. Calculations are age-adjusted by occupation. A job tenure supplement runs every two years, but the 2026 data hasn’t been released yet. I downloaded microdata via IPUMS. I analyzed and processed the data in R and made the charts with D3.js.

Making a Shiny app to illustrate the TWFE continuous weights



Brantly Callaway, Andrew Goodman-Bacon and Pedro Sant’Anna (hereafter CBS) have a new article conditionally accepted at the American Economic Review on continuous-treatment difference-in-differences. The paper has three main parts:

  • an introduction and formalization of various causal parameters related to a treatment “dosage” appropriate for the difference-in-differences framework. This is not what I’ll discuss today.

  • an introduction of an estimator that one can use to estimate some of these causal parameters when you have a continuous treatment dosage and a difference-in-differences treatment assignment. That isn’t what I’ll discuss today either.

  • a decomposition of the traditional two-way fixed effects (TWFE) estimator using Frisch-Waugh-Lovell. This is what I’ll talk about today.

There are four decompositions in the paper, and today I’ll only talk about one of them: the levels. In the last Substack post on this, I worked through that one.

Here it is formally:

\[
\beta^{\text{twfe}} \;=\; \int_{d_L}^{d_U} \underbrace{\frac{(l - E[D]) \cdot f_D(l)}{\operatorname{Var}(D)}}_{w^{\text{lev}}(l)} \cdot \big[m(l) - m(0)\big]\, dl
\]

And so with all that out of the way, I’ll move on to the next item on my agenda, which is to use Claude Code to create a Shiny app that helps all of us better understand just what’s going on in that formula. I have a walkthrough of that here in a 35-minute video, which resulted in this Shiny app that you can use now to help you better understand the TWFE decomposition and the whereabouts of the negative weights it uses to calculate its coefficients. This is my first Shiny app, and technically Claude Code made it, so it isn’t even my Shiny app, but I thought it was fun. I’m hosting it on my website.

There it is! My new CBS Shiny app for the level decomposition. Let me help you navigate it. First, notice there are four tabs labeled Level, Scaled level, Causal response and Scaled 2×2. If you click on the others, they show a “Coming soon” page. The only one working right now is the Level one, as that’s the only one we have discussed so far.

The images at the bottom are from the deck:

Let me briefly walk you through it. There are several elements in each of the decompositions, and this slide illustrates them:

The things on the left are the weight elements and the things on the right are the actual decomposition. So we’re doing the level weight, and therefore we have three elements to it: the mean of the treatment dosage in the data (E[D]), the variance, and the density. We integrate over the doses using the density function. Remember our TWFE formula from earlier:

\[
\beta^{\text{twfe}} \;=\; \int_{d_L}^{d_U} \underbrace{\frac{(l - E[D]) \cdot f_D(l)}{\operatorname{Var}(D)}}_{w^{\text{lev}}(l)} \cdot \big[m(l) - m(0)\big]\, dl
\]
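To build intuition for the weight term, here is a minimal numerical sketch of the level weights, w_lev(l) = (l − E[D]) f_D(l) / Var(D), using a simulated Beta(2, 8) dose distribution (an assumption of mine, not the paper’s tariff data). Two properties fall out immediately: the weights integrate to zero (so negative weights must exist somewhere), and the l-weighted integral is one.

```python
import numpy as np

# Simulated continuous doses in [0, 1] -- illustrative stand-in for the data.
rng = np.random.default_rng(0)
doses = rng.beta(2, 8, size=100_000)

mean_d, var_d = doses.mean(), doses.var()

# Histogram-based density estimate on a grid, then the level weight at each l.
density, edges = np.histogram(doses, bins=1000, range=(0.0, 1.0), density=True)
centers = (edges[:-1] + edges[1:]) / 2
w_lev = (centers - mean_d) * density / var_d

dl = centers[1] - centers[0]
print(round((w_lev * dl).sum(), 2))            # integral of the weights: ~0
print(round((w_lev * centers * dl).sum(), 2))  # l-weighted integral: ~1
```

The first property is why the sign flip discussed below is unavoidable: positive weight above the mean dose must be balanced by negative weight below it.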

There are two pieces inside this weight — the weight associated with units that have zero dose, and those that have positive dose. Look closely at Table 1, row 2, labeled “Levels”, and you’ll see it.

Back to the elements. Here’s the first one: the mean, E[D]:

Notice that in this case the mean dose (a tariff in this case) is 0.164. We will hang on to that. Then there is the variance. The variance, recall, is the square of the standard deviation, measuring the spread of the dose around the mean we just identified. And it is equal to 0.0202, or 0.02 for short. The variance scales the weight and appears in the denominator.

And then there is the density, f_D(l). This is what we will be integrating over. The dosage is presented as continuous, but if it were multi-valued, we would just take weighted averages. When the density is high, there are many units with that value, and when it is low, there are few. The fact that it is high at zero in the image below means that there are many units with zero dose. When we work with the density function, we calculate the density at a given dose, l. So if the dose were l = 0.10, we would just calculate the density associated with it.

And just to make sure we all see it, the x-axis always has the dose. The y-axis has the frequency, or count, or number of units, in the first two elements, and the density in the third.

Alright, so let’s look again real quick at the Shiny app page for the level decomposition. For instance, let’s say I want to know the density at 0.365. That is, for industries with the tariff value of 0.365, what is the density value? I simply slide the slider button at the top left labeled “Dose l:” to 0.365. And notice, it automatically calculates it. The value of the density at that point is 0.834. You can see it at the far right of the “Six elements” formula, and you can also see that it populates the image labeled “Plug in the numbers at the selected l”.

And so given the mean of 0.164, the variance of 0.0202, and a density value (which, unlike the mean and variance, is not a constant but changes at each dose) of 0.834, we get:

\[
w^{\text{lev}}(0.365) = \frac{(0.365 - 0.164) \cdot 0.834}{0.0202} \approx 8.30
\]

And that’s it! That’s how the level formula works. You can either move the slider left and right to find each industry’s density value, or you can move your cursor over the density and that will show you the density associated with that dosage. Either way. But the formula stays the same.

Now the most interesting thing in this decomposition is the sign flip. Notice what happens when the dosage is exactly equal to the mean. The mean, recall, is E[D], equalling 0.164:

\[
w^{\text{lev}}(0.164) = \frac{(0.164 - 0.164) \cdot 0.834}{0.0202} = 0
\]

So, when you have a unit or set of units whose dose is exactly the mean, the weight on them is, curiously, zero. Now in our case, because the treatment is a continuous dose for which the probability of any exact value is zero, I don’t show the weight at that level, as there is no one with exactly the mean dose. But you can see what happens if you move the slider left and right — when industries have an above-average dose, they are positively weighted, but when they have a below-average dose, they are negatively weighted. To illustrate this, I filmed myself a second time.
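The sign flip can be seen in a few lines of code. This sketch uses the mean (0.164) and variance (0.0202) quoted above; the density is held fixed at 0.834 purely for illustration, since in the app it changes with l:

```python
# Level weight at a given dose l. Holding the density fixed is a
# simplification for illustration only; in reality f_D(l) varies with l.
def w_lev(l, mean=0.164, var=0.0202, density=0.834):
    return (l - mean) * density / var

# Below the mean: negative. At the mean: zero. Above the mean: positive.
print(w_lev(0.10) < 0, w_lev(0.164) == 0.0, w_lev(0.365) > 0)  # True True True
```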

In the above video, I actually had a discrepancy that took me a bit to understand. Claude had the minimum dosage at 0.027 and a maximum dosage of 0.552. But the Gaussian kernel smeared a bit of probability mass to the left of the smallest observed dose. So the kernel put a teeny tiny bit of density at 0.005. The weight there is technically computable but substantively empty — it’s a weight on a dose group with no industries in it.

So in the new version, you’ll still see the sign flip, but now there are vertical dashed lines at the lowest and highest observed doses. It isn’t in the video, but if you watch the video, you’ll notice that’s where I start to catch it. I think it was probably fine to have left it there, but I felt like the smoothing the kernel function was doing was confusing me, and therefore I changed it. The top panel uses a kernel smoother of 0.005, but the bottom one was a bit under the largest kernel smoothing value possible. Just so you can see.

Anyway, the TWFE estimator integrates from the smallest to the highest doses, and when industries have doses below the mean, we are in that zone to the left, in that kind of orange-hued range, all of which are negative weights; and when the dose is above the mean, we are above zero in that turquoise blue, and those are positive weights.

And there you go! That’s the Shiny app, plus two video walkthroughs: the first one you can use to better see how to use Claude Code to make a Shiny app, and the second walks you through interpreting the Shiny app. I hope you found this helpful! Both for using Claude Code to make a Shiny app, and for interpreting the TWFE decomposition when your regression is in its “level” formulation (which will probably be most of the time, tbh).

And of course, if you found this helpful, consider becoming a paying subscriber!

From Risk to Asset: Designing a Practical Data Strategy That Actually Works



Most data platforms don’t fail with a big bang; they slowly degrade and lose influence.

At first, everything looks promising: dashboards are built, pipelines run, data becomes available, and teams start exploring. But over time something shifts:

  • definitions and ownership are unclear
  • the same metric shows different numbers in different dashboards, so people stop trusting the data and stop using it
  • change and decisions take longer instead of faster, and feel risky
  • teams start building their own logic in isolation, spreading logic across systems

Nothing is “down” or technically broken, yet the organization slowly loses control over how data is used.

In this article I outline a practical blueprint for building a data strategy that helps you take back control and turn data into an asset instead of a risk.


The Core of the Problem

It’s easy to point at technology: maybe the platform isn’t right, maybe we need a data lake, a new warehouse, or better tooling.

But in many cases that’s not the real problem.

The problem is not the tools. It’s the organization lacking a clear strategy to define, own, and use data consistently.

When that happens, familiar patterns emerge: definitions diverge, ownership becomes unclear, and logic spreads across dashboards, pipelines, and ad hoc analysis. Data is not trusted, and it stops behaving like a strategic asset and starts behaving like an organizational risk.

A data strategy can help solve exactly that.


Why you need a data strategy

A data strategy connects the highest level of an organization to the most concrete decisions. It links vision to execution. This ensures that all decisions contribute to the organization’s goals.

A good data strategy creates alignment across the organization. It benefits both business and IT:

  • it helps ensure that data work contributes to business goals
  • it gives direction for difficult decisions (e.g. database choice)
  • it creates shared ownership and accountability
  • it builds trust by making decisions more consistent and traceable.

What is a data strategy?

A data strategy is not a roadmap, a list of tools, or a collection of best practices. It is a chain that connects intention to action:

A data strategy defines how data is used to make decisions, who is accountable, and which trade-offs the organization is willing to make to make data work.

In practice, a data strategy does two things:

1. Define principles (what matters)
Think of these as the guardrails. An example could be “data is business-owned” or “definitions are shared”. These should be derived from your (data) vision.

2. Define choices (what you do under constraints)
The choices are the trade-offs you make. Do we choose strict governance or flexibility? Batch or real-time pipelines? Do we organize ownership centrally or decentrally?

The principles define the direction; the strategy emerges in the choices you make.

So a strong data strategy connects organizational direction (vision, mission) to day-to-day implementation. Many organizations skip this linking step and jump straight from vision to, e.g., tools. They miss an essential step in between, leading to the aforementioned problems.

https://mikehuls.com/layered-architecture-for-building-readable-robust-and-extensible-apps


Building a strategy in three components

Creating a strategy is pretty hard because it links the abstract world of vision and direction to the practical world of implementation. We’ll break the data strategy into three components:

Components of a strong data strategy (image by author)

Component 1: Direction

Direction defines what you optimize for.

Direction defines what you optimize for (image by author)

Strategy should always be grounded in the organization itself, based on the organization’s goals, vision, and mission.
If your data strategy doesn’t clearly connect to your organizational vision, it’s not a strategy. It’s a collection of initiatives.
We build our data strategy on top of the data vision, which aligns with the organization’s mission and vision:

Mission → Vision → Data Vision → Data Strategy → Implementation

Let’s quickly go through each part.

1.1. Mission & vision (why you exist)

The mission describes why the organization exists. It is usually stable, long-term, and rarely changes.
A vision describes what success looks like and where the organization is trying to go.

Example (electric car company):
Mission: “To accelerate the world’s transition to sustainable energy”
Vision: “To create the most compelling car company of the 21st century by driving the world’s transition to electric vehicles.”


1.2. Data Vision

Defines the role of data in the organization and how data supports the organization’s goals. It builds the bridge between business and data.

Example (electric car company):
“We operate with real-time, globally accessible data to enable rapid decision-making, optimize production and distribution, and accelerate market expansion.”


1.3. Data strategy (how you make it happen)

Strategy translates direction into choices. The vision defines the direction; the data strategy defines the trade-offs.

This is where decisions are made about ownership, governance, structure, and operating model, guided by the data vision.

Example (electric car company):
“Because we prioritize fast, data-driven decision-making, we choose real-time pipelines over batch processing, accepting higher complexity and cost in exchange for speed and availability.”


Component 2: Structure

In this part we create a set of deliberate choices inspired by the data vision. Together, these choices form the core of the data strategy.

In the next part we’ll go through each of the themes, define what they are about, list symptoms or problems that each theme should solve, and see some clear, practical examples.

Strategy emerges in this component (image by author)

You don’t necessarily have to “implement” these themes. Use them to stress-test your data strategy — to see where you made explicit choices and where you are relying on assumptions.

These five themes are not the strategy itself. The strategy emerges from the choices that are made. In short:

These themes don’t define your strategy; they help you see whether you actually have one.

Explicit vs implicit choices

If certain themes are not addressed explicitly, they still exist but emerge implicitly:

  • implicit governance → decisions made informally
  • implicit definitions → tribal knowledge instead of shared meaning
  • implicit ownership → whoever shouts the loudest

This is where things start to break down. Problems like these are not technical, but rather the result of missing structure.


🧭 2.1 Alignment

How data connects to real decisions and business value.

This theme should consist of choices that ensure data is tied to use cases and concrete decisions, and contributes directly to business goals. It ensures that data is used to make decisions, not just to produce information. Without this, data becomes a technical exercise instead of a business asset.

What problems show up here?

  • dashboards exist, but nobody uses them
  • teams don’t know why certain data exists
  • data work is driven by IT instead of business needs
  • unclear ownership of metrics and outcomes

Examples of choices that cover this theme:

  • “We prioritize building data products for specific decisions, not generic datasets.”
    This trades higher adoption and impact for less flexibility in ad hoc analysis.
  • “We assign ownership of data (definitions, meaning) to business domains.”
    Stronger accountability and relevance at the cost of less central control and standardization.
  • “We focus on a limited number of use cases that directly impact business outcomes.”
    Clear ROI and focus, but some lower-priority use cases are delayed.

🧱 2.2 Data foundation

Data cannot scale without shared meaning and consistency.

This theme covers choices that ensure data can be used across the organization. Think of shared and documented core definitions, consistently structured data, and adequate metadata that explains what data means and where it comes from.

What problems show up here?

  • the same metric has multiple definitions
  • teams argue about numbers instead of using them
  • data is hard to understand without asking someone
  • combining datasets leads to inconsistencies

Examples of choices that cover this theme:

  • “We define key business concepts (e.g. revenue, customer) centrally and reuse them.”
    Consistency and trust at the cost of speed of change and flexibility.
  • “We enforce consistent modeling patterns across datasets.”
    Easier collaboration and reuse, but less freedom for teams.
  • “We make data self-explanatory through documentation and lineage.”
    Easier onboarding and usage, but upfront effort and maintenance.

⚙️ 2.3 Operations

Reliability and day-to-day functioning of data systems.

This theme ensures that pipelines are stable and monitored, data quality is actively managed, and security and access are controlled. Without this, data cannot be trusted, even if everything else is well designed.

What problems show up here?

  • pipelines break or silently fail
  • data quality issues go unnoticed
  • numbers suddenly change without explanation
  • access to data is inconsistent or insecure

Examples:

  • “We build validation and testing into pipelines instead of fixing issues afterward.”
    Higher trust and reliability, but more upfront development effort.
  • “We actively monitor pipelines and data quality.”
    Faster issue detection, but more operational overhead.
  • “We define clear access rules for sensitive data.”
    Security and compliance, but reduced ease of access.

🚀 2.4 Evolvability

How easily does your data setup adapt to change, growth, and innovation?

A data strategy should make change easier, not harder.

Data should be modular and reusable across teams and domains. We should build on existing foundations, not reinvent them. Shared meaning allows teams to combine and use data without constant translation. Without this, change is expensive and progress slows down.

What problems show up here?

  • every new use case requires rebuilding logic
  • teams duplicate work across domains
  • changes are slow and risky
  • scaling data use becomes painful

Examples:

  • “We design data models to be reused across multiple use cases.”
    Faster future development, but more upfront design effort.
  • “We build loosely coupled components that can evolve independently.”
    Flexibility and scalability, but increased design complexity.
  • “We invest in shared meaning (e.g. semantic layers, ontologies).”
    Interoperability across teams, but governance and coordination effort.

🏛️ 2.5 Governance

How decisions about data are made and who is accountable.

This theme is about clearly defined ownership, explicit decision-making processes, and issues that are tracked and resolved. You create structure around who owns a definition, who decides when something changes, and how priorities are determined. Without governance, decisions become inconsistent and issues remain unresolved.

What problems show up here?

  • nobody knows who owns a dataset or definition
  • changes happen without coordination
  • data issues remain unresolved
  • priorities are unclear

Examples:

  • “We assign clear owners for data domains and definitions.”
    Accountability, but dependency on individuals.
  • “We define how changes to data are proposed and approved.”
    Consistency and control, but slower decision cycles.
  • “We track and prioritize data issues transparently.”
    Better prioritization and resolution, but extra process overhead.

Component 3: Execution

A strategy only matters if it becomes reality. This component is where we move from intention to operation.

Prerequisites for execution (image by author)

This is where many strategies fail: they look good on paper, but there is no concrete implementation plan that helps embed the strategy in how the organization actually works.

A practical way to design execution is through three dimensions:

  • People → who is accountable
  • Process → how it works
  • Technology → what supports it

If a strategic choice is not reflected in people, process, and technology, it doesn’t exist.

Example: business-owned data

One part of your strategy could be:

“We want data to be owned by the business.”

We make this statement real by defining what actually needs to happen in the real world, e.g.:

  • People → assign data owners within business domains
  • Process → define ownership workflows (definition changes, issue handling, prioritization)
  • Technology → enable visibility (data catalogs, lineage, access control)

Only when all three are in place does ownership actually exist.


Why this matters

Execution forces clarity. It exposes gaps such as:

  • ownership without accountability
  • processes without accountability
  • tools without adoption

It also reveals trade-offs like centralized vs decentralized ownership, speed vs control, and flexibility vs standardization.

In addition to implementation, the execution component is also a way to validate your strategy.

For every strategic choice, you should be able to answer:

  • Who owns it? (People)
  • How does it work? (Process)
  • What supports it? (Technology)

If one is missing, the strategy is incomplete.


Conclusion

A data strategy is a chain that connects intention to action. We’ve broken this down into three components:

Three components of designing a data strategy (image by author)
  • Direction ensures that data contributes to what actually matters; without direction you build the wrong things.
  • Structure ensures that the right conditions are in place. Without it, things don’t scale or stay consistent.
  • Execution ensures that those conditions become reality; without it nothing actually changes.

When all three components are aligned, decisions become faster, change becomes safer, and trust increases.

When data behaves like an asset instead of a risk, you have a data strategy that works.


I hope this article was as clear as I intended it to be, but if not, please let me know what I can do to clarify further. In the meantime, check out my other articles on all kinds of programming-related topics.

Happy coding!

— Mike

P.s.: like what I’m doing? Follow me!

Microsoft pulls service update causing Teams launch failures



Microsoft has reverted a recent service update that was preventing some customers from launching the Microsoft Teams desktop client.

Affected users are getting stuck on the loading screen and seeing the "We're having trouble loading your message. Try refreshing." error message.

On Friday morning, after acknowledging the incident (tracked under TM1283300), Microsoft said the launch failures were due to a transient issue in the service infrastructure that caused some older Microsoft Teams desktop client builds to "enter an unhealthy state."


"We have confirmed that our automated recovery system has successfully remediated impact, and we are reaching out to your representatives to validate this issue is fully resolved for all users," Microsoft said.

Three hours later, Microsoft reverted the buggy service update to address the problem, adding that it was caused by "a regression within the Microsoft Teams client build caching system."

Impacted Teams users are now advised to fully quit and restart their Teams clients to ensure that the fix propagates to their systems.

"Now that the update that introduced the regression has been fully reverted, a restart will be needed in which users fully quit and then restart Teams so that our solution propagates," Microsoft added in the latest update to the message center.

"We are continuing to await feedback from the subset of impacted users and monitoring our service telemetry to confirm the issue is resolved now that we have completed the aforementioned reversion."

While Microsoft did not share how many users or which regions are affected by this issue, it flagged the service outage as an incident, a designation that typically applies to critical service problems with noticeable user impact.

Last month, it resolved another known issue that caused launch failures in older builds of the classic Outlook email client, rendering it unusable for users who had enabled the latest version of the Microsoft Teams Meeting Add-in.

One week earlier, it released out-of-band updates to fix a major issue that broke sign-ins with Microsoft accounts across multiple Microsoft apps, including Teams clients.

Over the weekend, Microsoft also released a set of emergency updates to address known Windows Server issues that caused security update installation problems and sent domain controllers into a restart loop.


Altar to Sol: A rare 1,900-year-old monument dedicated to the Roman sun god and used in a secret underground ritual



The altar to Sol was pierced from behind so that light could shine through. (Image credit: © National Museums Scotland)

QUICK FACTS

Name: Altar to Sol

What it is: A carved sandstone altar

Where it's from: Inveresk, Scotland

When it was made: Second century

Mastering the boring reality of sexy AI


Building the building blocks

What do I mean by "engineering capability"? I definitely don't mean model access. Most everyone has that, or soon will. No, I mean the practical disciplines that turn a model into a system: data modeling, retrieval, evaluation, permissions, observability, and memory. You know, the unsexy, "boring" stuff that makes enterprise projects, particularly enterprise AI projects, succeed.

This informed how my team built our workshops. We didn't start with "here's how to build an autonomous employee." We started with the AI data layer: heterogeneous data, multiple representations, embeddings, vector indexes, hybrid retrieval, and the trade-offs among different data types (relational, document, and so on). In other words, we started with the stuff most AI marketing tries to skip. Much of the AI world seems to think AI starts with a prompt, when it actually starts with things like multimodel schema design, vector generation, indexing, and hybrid retrieval.
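To make "hybrid retrieval" concrete, here is a minimal sketch, with toy data and a deliberately simple scoring scheme (not the workshops' actual code): it blends a keyword-overlap score with cosine similarity over precomputed embedding vectors.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb) if na and nb else 0.0

def keyword_score(query, text):
    """Fraction of query words that appear in the document text."""
    q, t = set(query.lower().split()), set(text.lower().split())
    return len(q & t) / len(q) if q else 0.0

def hybrid_search(query, query_vec, docs, alpha=0.5):
    """Blend lexical and vector scores; alpha weights the vector side."""
    scored = []
    for doc in docs:
        score = (alpha * cosine(query_vec, doc["embedding"])
                 + (1 - alpha) * keyword_score(query, doc["text"]))
        scored.append((score, doc["text"]))
    return sorted(scored, reverse=True)

# Toy corpus with hand-made 3-d "embeddings" standing in for real model output.
docs = [
    {"text": "refund policy for enterprise contracts", "embedding": [0.9, 0.1, 0.0]},
    {"text": "quarterly dashboard usage report",       "embedding": [0.1, 0.8, 0.3]},
]
results = hybrid_search("enterprise refund policy", [0.85, 0.15, 0.05], docs)
print(results[0][1])  # the refund-policy document ranks first
```

Real systems replace the keyword side with something like BM25 and the vectors with model-generated embeddings in an index, but the core trade-off is the same: the alpha weight decides how much you trust semantic similarity over exact lexical matches.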

That matters because enterprise data isn't tidy. It lives in tables, PDFs, tickets, dashboards, row-level policies, and 20 years of organizational improvisation. If you don't know how to model that mess for retrieval, you won't have enterprise AI. You'll merely get a polished autocomplete system. As I've pointed out, the hard part isn't getting a model to sound good. It's getting the model to work inside the weird, company-specific reality where actual decisions are made.
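One way to picture "modeling that mess": normalize every source, whether a table row, a PDF chunk, or a ticket, into a uniform retrieval record that carries its access policy alongside its text. The schema below is a hypothetical sketch, assuming nothing about any particular stack:

```python
from dataclasses import dataclass, field

@dataclass
class RetrievalRecord:
    """A uniform unit of retrievable content, whatever the source."""
    source_type: str   # "table_row", "pdf_chunk", "ticket", ...
    source_id: str     # pointer back to the system of record
    text: str          # the representation that gets embedded/indexed
    allowed_roles: set = field(default_factory=set)  # row-level policy travels with the data

def from_table_row(row, table, roles):
    """Flatten a relational row into a sentence-like string for embedding."""
    text = "; ".join(f"{k}: {v}" for k, v in row.items())
    return RetrievalRecord("table_row", f"{table}/{row['id']}", text, roles)

rec = from_table_row({"id": 42, "customer": "Acme", "arr": 120000},
                     table="accounts", roles={"sales", "finance"})
print(rec.source_id)
```

Keeping the permissions on the record itself, rather than bolting them on at query time, is one way the "boring" disciplines (data modeling, permissions) show up directly in the retrieval layer.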